Segmentation fault (core dumped) - C calling Assembly code for prime testing-CodePudding

I have an assembly language code I'm calling from C but I keep getting error: Segmentation fault (core dumped) I don't know what's the cause.

; this is the assembly code
section .text

global isPrime

isPrime: mov ecx, 2
.test:   cmp ecx, [rsi]
         jle .doif
         jg .prime

.doif:   mov eax, [rdi]
         mov edx, 0
         mov ebx, [rsi]
         div ebx
         cmp edx, 0
         jle .notPrime
         jg .checkAgain
.notPrime:
         mov eax, 1
         ret

.checkAgain:
         inc ecx
         jmp .test

.prime: mov eax, 0
         ret

The C code:

// C code

#include <stdio.h>

extern int isPrime(int *number, int *mValue);

int main() {
    int limit, m, input = 0;
    printf("Enter the limit of the prime numbers:");
    input = scanf("%d", &limit);
    while (input != 1) {
        printf("Not a number!\n");
        scanf("%*[^\n]");
        printf("Enter the limit of the prime numbers:");
        input = scanf("%d", &limit);
    }
    for (int i = 2; i <= limit;   i) {
        m = i / 2;
        int flag = isPrime(&i, &m); //this is what I'm trying to implement
        // for (int j = 2; j <= m; j  ) {
        //     if (i % j == 0) {
        //         printf("Number %d is not prime\n", i);
        //         flag = 1;
        //         break;
        //     }
        // }
        printf("%d\n", flag);
        if (flag == 0)
            printf("Number %d is prime\n", i);
    }
  return 0;
}

Error:

Enter the limit of the prime numbers:10
0
0
Segmentation fault (core dumped)

the commented part in the C code is what I want to write in assembly but got the error I mentioned above. From my research, I'm trying to write a memory address I do not have access to. The error is from assembly code but I don't know where exactly, please any possible solutions?

CodePudding user response：

System V x86_64 calling convention requires registers rbx, rsp, rbp, r12, r13, r14, r15 to be preserved during function call. You use ebx register as a part of rbx but don't preserve it.

Straightforward solution is to preserve it on stack:

bits 64
section .text

global isPrime

isPrime: push rbx
         mov ecx, 2
.test:   cmp ecx, [rsi]
         jle .doif
         jg .prime

.doif:   mov eax, [rdi]
         mov edx, 0
         mov ebx, [rsi]
         div ebx
         cmp edx, 0
         jle .notPrime
         jg .checkAgain
.notPrime:
         pop rbx
         mov eax, 1
         ret

.checkAgain:
         inc ecx
         jmp .test

.prime:  pop rbx
         mov eax, 0
         ret

Note: there are two return points, so you have to restore rbx before each return.

It is also possible to save register before modification for each iteration. It saves one instruction of code size at the cost of executing push/pop every time through the loop. div is slow, but this might be even slower on some modern CPUs. (Example how not to do):

...
.doif:   push rbx       ; don't do this, save/restore around the whole function
         mov eax, [rdi]
         mov edx, 0
         mov ebx, [rsi]
         div ebx
         pop rbx
...

Better solution is to use scratch register which is allowed to be modified instead of ebx.

         mov r8d, [rsi]
         div r8d

Or even divide by value in memory:

        div dword [rsi]

Some another improvements:

For performance reason you can load values into registers at the beginning.
jle instruction is redundant: CPU continue to execute correct branch if jump if greater is not executed.
cdq instruction may be used to extend value in eax into edx:eax. Note: it behaves differently from mov edx,0 because it extends sign bit. However we receive pointers to signed integers, so in theory we should be using cdq and idiv, not zero-extend and div. (You don't want cdq div; if your input was negative, div will treat it as huge unsigned, and fault when the quotient doesn't fit in 32 bits.)
For equality to zero test edx,edx may be used instead of cmp edx,0, these instructions are one byte shorter and equal or faster. Then jz/jnz conditional jump is more appropriate for semantic meaning, but the same machine instruction as je/jne.
A similar x86 peephole optimization is zeroing a register with xor eax, eax
You can increment ecx unconditionally. It saves one jump instruction.

But there is logic error in your original code: all divisions are by the same value. For prime test you should divide by all values sequentially. (Or just by odd values, after checking once for an even number, so you can increment your counter by 2.)

         div ecx

So final code is:

section .text

global isPrime

isPrime: mov edi, [rdi]      ; load the pointed-to integers
         mov esi, [rsi]      ;  better: pass by value in the first place
         mov ecx, 2          ; starting divisor, and anything signed-less-than this is assumed prime.

                           ; do {
.test:   cmp ecx, esi
         jg .prime           ; if(ecx > endpoint) return yes

         mov eax, edi
         xor edx, edx        ; zero-extend EAX into EDX:EAX, or use cdq/idiv for signed division
         div ecx
         inc ecx
         test edx, edx
         jnz .test         ; }while( n % ecx == 0)
     ;; composite if we left the loop this way
         mov eax, 1
         ret

.prime:  xor eax, eax      ; eax=0
         ret

You declared your C variables as signed int, but you're using unsigned div in asm. Unsigned makes the most sense; normally people don't talk about negative primes.

Passing a 2nd arg in ESI/RSI doesn't make much sense; the upper bound divisor limit is part of the prime-testing algorithm, not something the caller should need to calculate for it. mov esi, edi ; shr esi, 1.

You can actually stop much sooner for large numbers, at sqrt(n) instead of n/2. Or when n/divisor <= divisor, so you don't actually have to compute a square root. See a code-review answer for a loop that does that, and increments by 2.

CodePudding user response：

Just in case anyone is facing similar issue and needs help:

section .text
global isPrime

isPrime: mov ecx, 2
.test:   cmp ecx, [rsi]
         jle .doif
         jg .prime

.doif:   mov eax, [rdi]
         mov edx, 0
         div ecx
         cmp edx, 0
         jg .checkAgain
         mov eax, 1
         ret

.checkAgain:
         inc ecx
         jmp .test

.prime:  mov eax, 0
         ret