Archlab Part A
Assignment 4: Y86-64 Programs
Due: Thursday April 29th, 11:59PM
Introduction
For this lab you will write some Y86-64 programs and become familiar with the Y86-64 tools. You will
also extend the SEQ simulator with a new instruction, iaddq, which will be helpful to you in writing your programs.
Logistics
You will work on this lab alone. Any clarifications and revisions to the assignment will be posted to the class website.
From one of the CS machines, download and extract the archlab.tar starter code.
cd cs224
wget https://cs224.cs.vassar.edu/labs/archlab.tar
tar xvf archlab.tar
cd archlab
Adding an Instruction to the SEQ Processor
You will be working in directory sim/seq for this part of the lab. Your task for this part is to extend the SEQ processor to support the iaddq V, rB instruction, which adds the constant value V to the destination register rB. To add this instruction, you will modify the file seq-full.hcl, which implements the version of SEQ described in the textbook. In addition, it contains declarations of some constants that you will need for your solution.
Byte          0       1       2   3   4   5   6   7   8   9
            +---+---+---+----+-------------------------------+
iaddq V, rB | C | 0 | F | rB |               V               |
            +---+---+---+----+-------------------------------+
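The byte layout above can be sketched in C. This is only an illustration, not part of the lab tools: the helper name encode_iaddq is hypothetical, and the sketch assumes the textbook's Y86-64 conventions (register IDs such as 3 for %rbx, 0xF for "no register", and immediates stored least significant byte first).

```c
#include <assert.h>
#include <stdint.h>

/* Register IDs from the textbook's Y86-64 encoding; 0xF means "no register". */
enum { REG_RAX = 0x0, REG_RBX = 0x3, REG_NONE = 0xF };

/* Hypothetical helper (not part of the lab tools): write the 10-byte
 * encoding of "iaddq V, rB" into out. */
void encode_iaddq(int64_t v, unsigned rb, uint8_t out[10]) {
    uint64_t bits = (uint64_t)v;
    int i;
    out[0] = 0xC0;                                     /* byte 0: icode C, ifun 0 */
    out[1] = (uint8_t)((REG_NONE << 4) | (rb & 0xF));  /* byte 1: F, rB */
    for (i = 0; i < 8; i++) {                          /* bytes 2-9: V, low byte first */
        out[2 + i] = (uint8_t)(bits >> (8 * i));
    }
}

/* Self-check: "iaddq $16, %rbx" should encode as c0 f3 10 00 00 00 00 00 00 00. */
void check_encode_iaddq(void) {
    uint8_t buf[10];
    int i;
    encode_iaddq(16, REG_RBX, buf);
    assert(buf[0] == 0xC0 && buf[1] == 0xF3 && buf[2] == 0x10);
    for (i = 3; i < 10; i++) {
        assert(buf[i] == 0x00);
    }
}
```

Calling check_encode_iaddq() from a main function runs the check; the same bytes should appear in the object code that yas produces for an iaddq instruction.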
Your HCL file must begin with a header comment containing the following information:
- Your name.
- A description of the computations required for the iaddq instruction. Use the descriptions of irmovq and OPq shown below as a guide.
Instruction   irmovq V, rB
Fetch         icode:ifun <- M1[PC]
              rA:rB <- M1[PC+1]
              valC <- M8[PC+2]
              valP <- PC + 10
Decode
Execute       valE <- 0 + valC
Memory
Write back    R[rB] <- valE
PC update     PC <- valP
Instruction   OPq rA, rB
Fetch         icode:ifun <- M1[PC]
              rA:rB <- M1[PC+1]
              valP <- PC + 2
Decode        valA <- R[rA]
              valB <- R[rB]
Execute       valE <- valB OP valA
              Set CC
Memory
Write back    R[rB] <- valE
PC update     PC <- valP
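To make the stage tables concrete, here is a rough C model of the two instructions described above, with addq standing in for OPq. The struct y86_state and the step_ function names are hypothetical simplifications, not the simulator's actual data structures, and the sketch assumes the textbook's three condition codes (ZF, SF, OF).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified processor state: 15 registers, PC, condition codes. */
struct y86_state {
    int64_t reg[15];
    uint64_t pc;
    int zf, sf, of;
};

/* irmovq V, rB: follows the stage table line by line. */
void step_irmovq(struct y86_state *s, int64_t valC, unsigned rB) {
    uint64_t valP = s->pc + 10;  /* Fetch: valP <- PC + 10 */
    int64_t valE = 0 + valC;     /* Execute: valE <- 0 + valC */
    s->reg[rB] = valE;           /* Write back: R[rB] <- valE */
    s->pc = valP;                /* PC update: PC <- valP */
}

/* addq rA, rB: the OPq table with OP fixed to addition. */
void step_addq(struct y86_state *s, unsigned rA, unsigned rB) {
    uint64_t valP = s->pc + 2;   /* Fetch: valP <- PC + 2 */
    int64_t valA = s->reg[rA];   /* Decode */
    int64_t valB = s->reg[rB];
    /* Execute: add in unsigned arithmetic so overflow wraps around. */
    int64_t valE = (int64_t)((uint64_t)valB + (uint64_t)valA);
    s->zf = (valE == 0);         /* Set CC */
    s->sf = (valE < 0);
    s->of = ((valA < 0) == (valB < 0)) && ((valE < 0) != (valA < 0));
    s->reg[rB] = valE;           /* Write back: R[rB] <- valE */
    s->pc = valP;                /* PC update: PC <- valP */
}

/* Self-check: load 0x00a into %rax and 0x0b0 into %rbx, then addq %rax, %rbx. */
void check_stage_model(void) {
    struct y86_state s = {0};
    step_irmovq(&s, 0x00a, 0);
    step_irmovq(&s, 0x0b0, 3);
    step_addq(&s, 0, 3);
    assert(s.reg[3] == 0x0ba);
    assert(s.pc == 22);          /* 10 + 10 + 2 bytes of instructions */
}
```

Each statement maps to one row of the corresponding table, which is the same level of detail expected in your header comment for iaddq.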
Building and Testing Your Solution
Once you have finished modifying seq-full.hcl, build a new instance of the SEQ simulator ssim based on this HCL file, and then test it:
Building a new simulator
You can use make to build a new SEQ simulator.
make VERSION=full
This builds a version of ssim that uses the control logic you specified in seq-full.hcl. To save typing, you can assign VERSION=full in the Makefile.
Testing your solution on a simple Y86-64 program
For your initial testing, I recommend running simple programs such as asumi.yo (for testing iaddq) in TTY (non-GUI) mode, comparing the results against the ISA simulation:
./ssim -t ../y86-code/asumi.yo
If the ISA test fails, debug your implementation by single-stepping the simulator in GUI mode:
./ssim -g ../y86-code/asumi.yo
Retesting your solution using the benchmark programs
Once your simulator is able to correctly execute small programs, then you can automatically test it on the Y86-64 benchmark programs in ../y86-code:
(cd ../y86-code; make testssim)
This will run ssim on the benchmark programs and check for correctness by comparing the resulting processor state with the state from a high-level ISA simulation. Note that none of these programs test the added instruction. You are simply making sure that your solution did not introduce errors in the original instructions. See the file ../y86-code/README for more details.
Performing regression tests
Once you can execute the benchmark programs correctly, then you should run the extensive set of regression tests in ../ptest. To test everything except iaddq:
(cd ../ptest; make SIM=../seq/ssim)
To test your implementation of iaddq:
(cd ../ptest; make SIM=../seq/ssim TFLAGS=-i)
For more information on the SEQ simulator refer to the CS:APP3e Guide to Y86-64 Processor Simulators.
Y86-64 Programs
You will be working in directory archlab/misc for this part of the lab. Your task is to write and simulate three Y86-64 programs. The required behavior of these programs is defined by the example C functions in examples.c. Be sure to put your name in a comment at the beginning of each program. You can test your programs by first assembling them with the program yas and then running them with the instruction set simulator yis.
In all of your Y86-64 functions, you must follow the x86-64 conventions for passing function arguments, using registers, and using the stack. This includes saving and restoring any callee-save registers (%rbx, %rbp, %r12, %r13, and %r14) that you use.
sum.ys: Iteratively sum linked list elements
Write a Y86-64 program sum.ys that iteratively sums the elements of a linked list. Your program should
consist of code that sets up the stack structure, invokes a function, and then halts. For this program, we have provided starter code that does this for you. You should implement the Y86-64 code for a function sum_list that is functionally equivalent to the C sum_list function shown below.
/* linked list element */
struct ll_node {
    long val;
    struct ll_node *next;
};

/* sum_list - Sum the elements of a linked list */
long sum_list(struct ll_node *lst) {
    long val = 0;
    while (lst) {
        val += lst->val;
        lst = lst->next;
    }
    return val;
}
Test your program using the following three-element list:
# Linked List
    .align 8
ele1:
    .quad 0x00a
    .quad ele2
    .pos 48
ele2:
    .quad 0x0b0
    .quad ele3
    .pos 64
ele3:
    .quad 0xc00
    .quad 0
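Before writing the assembly, it can help to confirm the expected result in C. The sketch below rebuilds the three-element test list as C globals (the names ele1, ele2, and ele3 mirror the labels above; the check function is a hypothetical helper) and runs the reference sum_list on it.

```c
#include <assert.h>
#include <stddef.h>

/* linked list element */
struct ll_node {
    long val;
    struct ll_node *next;
};

/* sum_list - Sum the elements of a linked list (reference version) */
long sum_list(struct ll_node *lst) {
    long val = 0;
    while (lst) {
        val += lst->val;
        lst = lst->next;
    }
    return val;
}

/* The test list, defined last-to-first so each node can point at the next. */
struct ll_node ele3 = { 0xc00, NULL };
struct ll_node ele2 = { 0x0b0, &ele3 };
struct ll_node ele1 = { 0x00a, &ele2 };

/* Hypothetical self-check: the three values sum to 0xcba. */
void check_sum_list(void) {
    assert(sum_list(&ele1) == 0xcba);
    assert(sum_list(NULL) == 0);   /* an empty list sums to zero */
}
```

Your Y86-64 sum_list should leave the same value, 0xcba, in %rax when yis finishes running sum.yo.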
rsum.ys: Recursively sum linked list elements
Write a Y86-64 program rsum.ys that recursively sums the elements of a linked list. This code should
be similar to the code in sum.ys, except that it should use a function rsum_list that recursively sums a list of numbers, as in the C function rsum_list shown below. Test your program using the same three-element list you used for testing sum.ys.
/* rsum_list - Recursive version of sum_list */
long rsum_list(struct ll_node *lst) {
    long val;  // value for the node
    long rest; // rest of the list
    if (!lst) {
        return 0;
    }
    else {
        val = lst->val;
        rest = rsum_list(lst->next);
        return val + rest;
    }
}
Note: when implementing this recursive function, pay special attention to callee-saved registers. I recommend looking at the x86-64 assembly code for this function (shown below) to get a sense of the overall structure.
rsum_list:
    testq %rdi, %rdi
    jne .L8
    movl $0, %eax
    ret
.L8:
    pushq %rbx
    movq (%rdi), %rbx
    movq 8(%rdi), %rdi
    call rsum_list
    addq %rbx, %rax
    popq %rbx
    ret
Notice how %rbx (which holds val) is used before and after the call to rsum_list. %rbx is a callee-saved register, so its value must not be changed by the call to rsum_list. To preserve the caller's value of %rbx, the function first pushes the old value onto the stack, and then restores it by popping it off the stack before returning.
copy.ys: Copy a source block to a destination block
Write a program copy.ys that copies a block of words from one part of memory to another (non-overlapping) area of memory, computing the checksum (xor) of all the words copied.
Your program should consist of code that sets up a stack frame, invokes the function copy_block, and
then halts. The function should be functionally equivalent to the C function copy_block shown below.
/* copy_block - Copy src to dest and return xor checksum of src */
long copy_block(long *src, long *dest, long len) {
    long result = 0;
    while (len > 0) {
        long val = *src++;
        *dest++ = val;
        result ^= val;
        len--;
    }
    return result;
}
Test your program using the following three-element source and destination blocks:
    .align 8
# Source block
src:
    .quad 0x00a
    .quad 0x0b0
    .quad 0xc00
# Destination block
dest:
    .quad 0x111
    .quad 0x222
    .quad 0x333
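The expected behavior on these blocks can be checked in C first. In the sketch below, the array names, the guard words around the destination, and the check function are all hypothetical additions for illustration; they are one simple way to notice a copy that writes outside the 24-byte destination.

```c
#include <assert.h>

/* copy_block - Copy src to dest and return xor checksum of src (reference version) */
long copy_block(long *src, long *dest, long len) {
    long result = 0;
    while (len > 0) {
        long val = *src++;
        *dest++ = val;
        result ^= val;
        len--;
    }
    return result;
}

/* Source block, matching the test data above. */
long src_block[3] = { 0x00a, 0x0b0, 0xc00 };

/* Destination block in the middle, with a hypothetical guard word on each side. */
long dest_area[5] = { 0x7777, 0x111, 0x222, 0x333, 0x7777 };

/* Hypothetical self-check covering checksum, copied data, and stray writes. */
void check_copy_block(void) {
    long result = copy_block(src_block, &dest_area[1], 3);
    assert(result == 0xcba);                                   /* xor checksum */
    assert(dest_area[1] == 0x00a && dest_area[2] == 0x0b0 && dest_area[3] == 0xc00);
    assert(dest_area[0] == 0x7777 && dest_area[4] == 0x7777);  /* guards untouched */
}
```

Because the three source values have no bits in common, their xor equals their sum, which is why the checksum is also 0xcba.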
Evaluation
This lab is worth 100 points: 25 points for each Y86-64 solution program and 25 points for the successful implementation of the iaddq instruction. Each solution program will be evaluated for correctness, including proper handling of the stack and registers, as well as functional equivalence with the example C functions in examples.c.
The programs sum.ys and rsum.ys will be considered correct if I do not spot any errors in them, and their respective sum_list and rsum_list functions return the sum 0xcba in register %rax.
The program copy.ys will be considered correct if I do not spot any errors in it, and the copy_block function returns the checksum 0xcba in register %rax, copies the three 64-bit values 0x00a, 0x0b0, and 0xc00 to the 24 bytes beginning at address dest, and does not corrupt other memory locations.
To get full credit for the implementation of the iaddq instruction, you must have a description of the
computations required for the iaddq instruction, similar to the descriptions in Figure 4.18 on page 387 of your textbook. In addition, you must also pass the benchmark regression tests in y86-code to verify that your simulator still correctly executes the benchmark suite, and pass the regression tests in ptest.
Handin Instructions
You will be handing in the following files:
- sum.ys
- rsum.ys
- copy.ys
- seq-full.hcl
Make sure you have included your name in a comment at the top of each of your files.
To submit your files, go to your archlab directory and type:
make submitA
You may submit as many times as you like. I will only look at the latest submission for grading.