Low-Level I/O, C++ Intro CSE 333 Spring 2018

L08: Syscalls, POSIX I/O
System Calls, POSIX I/O
CSE 333 Spring 2019

Instructor: Justin Hsia

Teaching Assistants: Aaron Johnston, Andrew Hu, Daniel Snitkovskiy, Forrest Timour, Kevin Bi, Kory Watson, Pat Kosakanchit, Renshu Gu, Tarkan Al-Kazily, Travis McGaha

Administrivia
- Exercise 7 posted tomorrow, due Monday (4/22)
- Homework 1 due tomorrow night (4/18)
  - Watch that hashtable.c doesn't violate the modularity of ll.h
  - Watch for pointers to local (stack) variables

  - Use a debugger (e.g. gdb) if you're getting segfaults
  - Advice: clean up "to do" comments, but leave "step #" markers for graders
  - Late days: don't tag hw1-final until you are really ready
  - Bonus: if you add unit tests, put them in a new file and adjust the Makefile
- Homework 2 will be released on Friday (4/19)

Lecture Outline
- C Stream Buffering
- System Calls
- POSIX Lower-Level I/O
- C++ Preview

Buffering
- By default, stdio uses buffering for streams:
  - Data written by fwrite() is copied into a buffer allocated by stdio inside your process' address space
  - At some point, the buffer will be "drained" into the destination:
    - When you explicitly call fflush() on the stream
    - When the buffer size is exceeded (often 1024 or 4096 bytes)
    - For stdout to console, when a newline is written ("line buffered") or when some other function tries to read from the console
    - When you call fclose() on the stream
    - When your process exits gracefully (exit() or return from main())

Buffering Issues
- What happens if...
  - Your computer loses power before the buffer is flushed?
  - Your program assumes data is written to a file and signals another program to read it?
- Performance implications:
  - Data is copied into the stdio buffer
    - Consumes CPU cycles and memory bandwidth
    - Can potentially slow down high-performance applications, like a web server or database ("zero-copy")

Buffering Issue Solutions
- Turn off buffering with setbuf(stream, NULL)
  - Unfortunately, this may also cause performance problems
    - e.g. if your program does many small fwrite()s, each one will now trigger a system call into the Linux kernel
- Use a different set of system calls
  - POSIX (OS layer) provides open(), read(), write(), close(), etc.
  - No buffering is done at the user level
- But what about the layers below?
  - The OS caches disk reads and writes in the FS buffer cache
  - Disk controllers have caches too!

Lecture Outline
- C Stream Buffering
- System Calls
- POSIX Lower-Level I/O
- C++ Preview

What's an OS?
- Software that:
  - Directly interacts with the hardware
    - OS is trusted to do so; user-level programs are not
    - OS must be ported to new hardware; user-level programs are portable
  - Manages (allocates, schedules, protects) hardware resources
    - Decides which programs can access which files, memory locations, pixels on the screen, etc. and when
  - Abstracts away messy hardware devices
    - Provides high-level, convenient, portable abstractions (e.g. files, disk blocks)

OS: Abstraction Provider
- The OS is the "layer below"
  - A module that your program can call (with system calls)
  - Provides a powerful OS API (POSIX, Windows, etc.)
- [Figure: a process running your program calls through the OS API into the OS, which contains:]
  - File System: open(), read(), write(), close(), ...
  - Network Stack: connect(), listen(), read(), write(), ...
  - Virtual Memory: brk(), shm_open(), ...
  - Process Management: fork(), wait(), nice(), ...

OS: Protection System
- OS isolates processes from each other
  - But permits controlled sharing between them
    - Through shared name spaces (e.g. file names)
- OS isolates itself from processes
  - Must prevent processes from accessing the hardware directly
- OS is allowed to access the hardware
  - User-level processes run with the CPU (processor) in unprivileged mode
  - The OS runs with the CPU in privileged mode
  - User-level processes invoke system calls to safely enter the OS
- [Figure: Processes A, B, C (untrusted) and Process D (trusted) sit above the OS (trusted), which sits above the HW (trusted)]

System Call Trace
- A CPU (thread of execution) is running user-level code in Process A; the CPU is set to unprivileged mode.
- Code in Process A invokes a system call; the hardware then sets the CPU to privileged mode and traps into the OS, which invokes the appropriate system call handler.
- Because the CPU executing the thread that's in the OS is in privileged mode, it is able to use privileged instructions that interact directly with hardware devices like disks.
- Once the OS has finished servicing the system call, which might involve long waits as it interacts with HW, it:
  - (1) Sets the CPU back to unprivileged mode and
  - (2) Returns out of the system call back to the user-level code in Process A.
- The process continues executing whatever code is next after the system call invocation.
- Useful reference: CSPP 8.1-8.3 (the 351 book)

Details on x86/Linux
- A more accurate picture:
  - Consider a typical Linux process
  - Its thread of execution can be in one of several places:
    - In your program's code
    - In glibc, a shared library containing the C standard library, POSIX support, and more
    - In the Linux architecture-independent code
    - In Linux x86-64 code
- [Figure: software layers, top to bottom: your program; glibc (C standard library, POSIX); Linux system calls; Linux kernel (architecture-independent code, architecture-dependent code)]

- Some routines your program invokes may be entirely handled by glibc without involving the kernel
  - e.g. strcmp() from string.h
  - There is some initial overhead when invoking functions in dynamically linked libraries (during loading)
    - But after symbols are resolved, invoking glibc routines is basically as fast as a function call within your program itself!

- Some routines may be handled by glibc, but they in turn invoke Linux system calls
  - e.g. POSIX wrappers around Linux syscalls
    - POSIX readdir() invokes the underlying Linux readdir()
  - e.g. C stdio functions that read and write from files
    - fopen(), fclose(), fprintf() invoke underlying Linux open(), close(), write(), etc.

- Your program can choose to directly invoke Linux system calls as well
  - Nothing is forcing you to link with glibc and use it
  - But relying on directly-invoked Linux system calls may make your program less portable across UNIX varieties

Details on x86/Linux
- Let's walk through how a Linux system call actually works
  - We'll assume 32-bit x86 using the modern SYSENTER / SYSEXIT x86 instructions
    - x86-64 code is similar, though details always change over time, so take this as an example, not a debugging guide

Details on x86/Linux
- Remember our process address space picture? Let's add some details:
- [Figure: process address space from 0xFFFFFFFF down to 0x00000000: Linux kernel and kernel stack; linux-gate.so; Stack; Shared Libraries; Heap (malloc/free); Read/Write Segment (.data, .bss); Read-Only Segment (.text, .rodata). Beside it: the software layers (your program, glibc, Linux kernel) and the CPU. The figure is repeated on each step of the walk-through, marking where SP and IP point and whether the CPU is privileged.]

- Process is executing your program code
  - [Figure state: CPU unprivileged]

- Process calls into a glibc function
  - e.g. fopen()
  - We'll ignore the messy details of loading/linking shared libraries
  - [Figure state: CPU unprivileged]

Details on x86/Linux
- glibc begins the process of invoking a Linux system call
  - glibc's fopen() likely invokes Linux's open() system call
  - Puts the system call # and arguments into registers
  - Uses the call x86 instruction to call into the routine __kernel_vsyscall located in linux-gate.so
  - [Figure state: CPU unprivileged]

- linux-gate.so is a vdso
  - A virtual dynamically-linked shared object
  - Is a kernel-provided shared library that is plunked into a process' address space
  - Provides the intricate machine code needed to trigger a system call
  - [Figure state: CPU unprivileged]

Details on x86/Linux
- linux-gate.so eventually invokes the SYSENTER x86 instruction
  - SYSENTER is x86's "fast system call" instruction
    - Causes the CPU to raise its privilege level
    - Traps into the Linux kernel by changing the SP, IP to a previously-determined location
    - Changes some segmentation-related registers (see CSE451)
  - [Figure state: CPU privileged]

- The kernel begins executing code at the SYSENTER entry point
  - Is in the architecture-dependent part of Linux
  - Its job is to:
    - Look up the system call number in a system call dispatch table
    - Call into the address stored in that table entry; this is Linux's system call handler
      - For open(), the handler is named sys_open, and is system call #5
  - [Figure state: CPU privileged]

Details on x86/Linux
- The system call handler executes
  - What it does is system-call specific
  - It may take a long time to execute, especially if it has to interact with hardware
    - Linux may choose to context switch the CPU to a different runnable process
  - [Figure state: CPU privileged]

- Eventually, the system call handler finishes
  - Returns back to the system call entry point
    - Places the system call's return value in the appropriate register
    - Calls SYSEXIT to return to the user-level code
  - [Figure state: CPU privileged]

- SYSEXIT transitions the processor back to user-mode code
  - Restores the IP, SP to user-land values
  - Sets the CPU back to unprivileged mode
  - Changes some segmentation-related registers (see CSE451)
  - Returns the processor back to glibc
  - [Figure state: CPU unprivileged]

- glibc continues to execute
  - Might execute more system calls
  - Eventually returns back to your program code
  - [Figure state: CPU unprivileged]

strace

- A useful Linux utility that shows the sequence of system calls that a process makes:

    bash$ strace ls 2>&1 | less
    execve("/usr/bin/ls", ["ls"], [/* 41 vars */]) = 0
    brk(NULL) = 0x15aa000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f03bb741000
    access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=126570, ...}) = 0
    mmap(NULL, 126570, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f03bb722000
    close(3) = 0
    open("/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300j\0\0\0\0\0\0"..., 832) = 832
    fstat(3, {st_mode=S_IFREG|0755, st_size=155744, ...}) = 0
    mmap(NULL, 2255216, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f03bb2fa000
    mprotect(0x7f03bb31e000, 2093056, PROT_NONE) = 0
    mmap(0x7f03bb51d000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x23000) = 0x7f03bb51d000

If You're Curious
- Download the Linux kernel source code
  - Available from http://www.kernel.org/
- man, section 2: Linux system calls
  - man 2 intro
  - man 2 syscalls
- man, section 3: glibc/libc library functions
  - man 3 intro
- The book: The Linux Programming Interface by Michael Kerrisk (keeper of the Linux man pages)

Lecture Outline
- C Stream Buffering
- System Calls
- POSIX Lower-Level I/O
- C++ Preview

C Standard Library File I/O
- So far you've used the C standard library to access files
  - Use a provided FILE* stream abstraction
  - fopen(), fread(), fwrite(), fclose(), fseek()
- These are convenient and portable
  - They are buffered
  - They are implemented using lower-level OS calls

Lower-Level File Access
- Most UNIX-en support a common set of lower-level file access APIs: POSIX (Portable Operating System Interface)
  - open(), read(), write(), close(), lseek()
    - Similar in spirit to their f*() counterparts from C std lib
    - Lower-level and unbuffered compared to their counterparts
    - Also less convenient
  - You will have to use these to read file system directories and for network I/O, so we might as well learn them now

open()/close()
- To open a file:
  - Pass in the filename and access mode
    - Similar to fopen()
  - Get back a "file descriptor"
    - Similar to FILE* from fopen(), but is just an int
    - Defaults: 0 is stdin, 1 is stdout, 2 is stderr

    #include <fcntl.h>   // for open()
    #include <unistd.h>  // for close()
    #include <stdio.h>   // for perror()
    #include <stdlib.h>  // for exit(), EXIT_FAILURE
    ...
    int fd = open("foo.txt", O_RDONLY);
    if (fd == -1) {
      perror("open failed");
      exit(EXIT_FAILURE);
    }
    ...
    close(fd);

Reading from a File
- ssize_t read(int fd, void* buf, size_t count);
  - Returns the number of bytes read
    - Might be fewer bytes than you requested (!!!)
    - Returns 0 if you're already at the end-of-file
    - Returns -1 on error (and sets errno)
  - There are some surprising error modes (check errno)
    - EBADF: bad file descriptor
    - EFAULT: output buffer is not a valid address
    - EINTR: read was interrupted, please try again (ARGH!!!!)
    - And many others...

One way to read() bytes
- Which is the correct completion of the blank below?
  - Vote at http://PollEv.com/justinh

    char* buf = ...;  // buffer of size n
    int bytes_left = n;
    int result;       // result of read()

    while (bytes_left > 0) {
      result = read(fd, ______, bytes_left);
      if (result == -1) {
        if (errno != EINTR) {
          // a real error happened,
          // so return an error result
        }
        // EINTR happened,
        // so do nothing and try again
        continue;
      }
      bytes_left -= result;
    }

- A. buf
- B. buf + bytes_left
- C. buf + bytes_left - n
- D. buf + n - bytes_left
- E. We're lost...

One method to read() bytes (readN.c)

    int fd = open(filename, O_RDONLY);
    char* buf = ...;  // buffer of appropriate size
    int bytes_left = n;
    int result;

    while (bytes_left > 0) {
      result = read(fd, buf + (n - bytes_left), bytes_left);
      if (result == -1) {
        if (errno != EINTR) {
          // a real error happened, so return an error result
        }
        // EINTR happened, so do nothing and try again
        continue;
      } else if (result == 0) {
        // EOF reached, so stop reading
        break;
      }
      bytes_left -= result;
    }
    close(fd);

Other Low-Level Functions

- Read man pages to learn about:
  - write(): write data
    - #include <unistd.h>
  - fsync(): flush data to the underlying device
    - #include <unistd.h>
  - opendir(), readdir(), closedir(): deal with directory listings
    - Make sure you read the section 3 version (e.g. man 3 opendir)
    - #include <dirent.h>
- A useful shortcut sheet (from CMU): http://www.cs.cmu.edu/~guna/15-123S11/Lectures/Lecture24.pdf
