C11: A New C Standard Aiming at Safer Programming
Thirteen years after the ratification of the C99 standard, a new C standard is now available. Danny Kalev, a former member of the C++ standards committee, shares an overview of the goodies that C11 has to offer including multithreading support, safer standard libraries, and better compliance with other industry standards.
C11 is the informal name for ISO/IEC 9899:2011, the current standard for the C language that was ratified by ISO in December 2011. C11 standardizes many features that have already been available in common contemporary implementations, and defines a memory model that better suits multithreading. Put differently, C11 is a better C.
Problems with the C99 Standard
C99, the previous C standard, brought about many new features including:
- Variable length arrays
- Designated initializers
- Type-generic math library
- New datatypes: long long, _Complex, _Bool
- restrict pointers
- Intermingled declarations of variables
- Inline functions
- One-line comments that begin with //
Alas, it hasn't been a huge success. Finding C99-compliant implementations is a challenge even today.
Where did C99 go awry? Some of its mandatory features proved difficult to implement on some platforms. Other C99 features were considered questionable or experimental, to such an extent that certain vendors even advised C programmers to replace C with C++.
Politics also played a role in the lukewarm reception of C99. It's no secret that the cooperation between the C and C++ standards committees in the late 1990s was lacking, to say the least. The good news is that today, the cooperation between the two committees is much better, and that the design mistakes of C99 were avoided in C11.
A New Standard, a New Hope?
C's security has always been a matter of concern. Insecure features – such as string manipulation functions that don't check bounds and file I/O functions that don't validate their arguments – have been a fertile source of malicious code attacks.
C11 tackles these issues with a new set of safer standard functions that aim to replace the traditional unsafe functions (although the latter are still available in C11). Additionally, C11 includes Unicode support, compliance with IEC 60559 floating-point arithmetic and IEC 60559 complex arithmetic, memory alignment facilities, anonymous structs and unions, the _Noreturn function specifier, and most importantly – multithreading support. Yes, I said the m-word!
Let's look at some of these features and others more closely.
Multithreading Support
For the typical C programmer, the biggest change in C11 is its standardized multithreading support. C programs have, of course, been multithreaded for decades. However, all of the popular C threading libraries have thus far been non-standard extensions, and hence non-portable.
The new C11 header file <threads.h> declares functions for creating and managing threads, mutexes, and condition variables. The new _Atomic type qualifier, together with another new header file, <stdatomic.h>, provides facilities for uninterruptible object access. Finally, C11 introduces a new storage class specifier, _Thread_local (the C equivalent of C++11's thread_local). A variable declared _Thread_local isn't shared by multiple threads. Rather, every thread gets a unique copy thereof.
As an anecdote, if you're looking for someone to blame for the unwieldy keyword _Thread_local, blame Yours Truly. In the early 2000s, when the C++ standards committee began working on multithreading support, the original proposal for thread-local storage used the keyword __thread which I considered dangerous and opaque as it didn't clearly express the intent of the keyword (after all, __thread didn't create threads!), and might have conflicted with legacy code that happened to use __thread for user-declared identifiers. My proposal to change __thread to thread_local was accepted. thread_local has since percolated into other programming languages, including C11. Donations and hate mail alike are welcome!
Another C11 thread-related feature is the quick_exit() function that lets you terminate a program when exit() won't work, e.g., when cooperative cancellation of threads is impossible. The quick_exit() function ensures that functions registered with at_quick_exit() are called in the reverse order of their registration. After that, quick_exit() calls _Exit(), which, as opposed to exit(), isn't required to flush the process's file buffers.
Anonymous structs and unions
An anonymous struct or union is one that has neither a tag name nor a typedef name. It's useful for nesting aggregates, e.g., a union member of a struct. The following C11 code declares a struct with an anonymous union and accesses the union's data member directly:
struct T //C++, C11
{
  int m;
  union { char * index; int key; }; //anonymous union
};
struct T t;
t.key=1300; //access the union's member directly
Type-Generic Functions
C11 doesn't have templates yet, but it does have a macro-based method of defining type-generic functions. The new keyword _Generic introduces a generic selection expression that translates into type-dependent "specializations."
In the following example, the generic cubic root calculation macro cbrt(X) evaluates to the specializations cbrtl(long double), cbrtf(float) and the default cbrt(double), depending on the actual type of the parameter X:
#define cbrt(X) _Generic((X), long double: cbrtl, \
                              default: cbrt, \
                              float: cbrtf)(X)
How does it work? The compiler determines the type of the expression X at compile time, without evaluating it, and selects the matching association: cbrtl() if X is long double, cbrtf() for float, and cbrt() otherwise. The selected function is then called with X as its argument.
Memory Alignment Control
Taking after C++11, C11 introduces facilities for probing and enforcing the memory alignment of variables and types. The _Alignas keyword specifies the requested alignment for an object or a struct member. The _Alignof operator reports the alignment of its operand. Finally, the aligned_alloc() function
void *aligned_alloc(size_t algn, size_t size);
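allocates size bytes of memory with alignment algn and returns a pointer to the allocated memory. C11 requires size to be an integral multiple of algn.
The new header file <stdalign.h> defines the convenience macros alignas and alignof for the two keywords; aligned_alloc() itself is declared in <stdlib.h>.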
The _Noreturn Function Specifier
_Noreturn declares a function that does not return. This new function specifier has two purposes: suppressing compiler warnings on a function that doesn't return, and enabling certain optimizations that are allowed only on functions that don't return.
_Noreturn void func (); //C11, func never returns
Unicode Support
The Unicode standard defines three encoding formats: UTF-8, UTF-16, and UTF-32. Each has advantages and disadvantages. Currently, programmers use char to encode UTF-8, unsigned short or wchar_t for UTF-16, and unsigned long or wchar_t for UTF-32. C11 eliminates these hacks by introducing two new datatypes with platform-independent widths: char16_t and char32_t for UTF-16 and UTF-32, respectively (UTF-8 encoding uses char, as before). C11 also provides the u and U prefixes for UTF-16 and UTF-32 literals, and the u8 prefix for UTF-8 encoded string literals. Finally, the new types and the Unicode conversion functions are declared in <uchar.h>.
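Static Assertions
C11 adds static assertions through the new _Static_assert keyword, which <assert.h> also exposes as the macro static_assert. Unlike the #if and #error preprocessor directives, static assertions are evaluated at a later translation phase, when the type of the expression is known. Therefore, static assertions let you catch errors that are impossible to detect during the preprocessing phase.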
Bounds-Checking Functions
Technical Report 24731-1, which has been incorporated into C11 as the optional Annex K, defines bounds-checking versions of standard C library string manipulation functions. The bounds-checking versions have the _s suffix appended to the original function names.
For example, the bounds-checking versions of strcat() and strncpy() are strcat_s() and strncpy_s(), respectively. Most of the bounds-checking functions take an additional parameter indicating the size of the buffer they process. Many of them also perform additional runtime checks to detect various runtime-constraint violations, such as null pointers and overlapping buffers.
Let's look at two famous string manipulation functions:
//C11, safe version of strcat
errno_t strcat_s(char * restrict s1,
rsize_t s1max,
const char * restrict s2);
strcat_s() takes the total size of the destination buffer, s1max, and never writes past it: the existing contents of s1, the appended copy of s2, and the terminating null character must all fit in s1max bytes. The second function, strcpy_s(), requires that s1max be greater than the length of s2 (more precisely, s1max should be greater than strnlen_s(s2, s1max)) in order to prevent an out-of-bounds write:
//C11, safe version of strcpy
errno_t strcpy_s(char * restrict s1,
rsize_t s1max,
const char * restrict s2);
Originally, all of the bounds-checking libraries were developed by Microsoft's Visual C++ team. The C11 implementation is similar but not identical.
Removal of gets()
gets() (declared in <stdio.h>) reads a line from the standard input and stores it in a buffer provided by the caller. gets() doesn't know the actual size of its buffer. Malicious software tools and crackers have often exploited this security loophole to mount buffer overflow attacks. Consequently, gets() was deprecated in C99. C11 removed it entirely, replacing it with a safer version called gets_s():
char *gets_s(char * restrict buffer, size_t nch);
gets_s() stores at most nch - 1 characters from the standard input in buffer and always null-terminates the result.
New fopen() Interface
fopen(), a widely-used file I/O function, gets a facelift in C11. It now supports a new exclusive create-and-open mode ("...x"). The new mode behaves like O_CREAT|O_EXCL in POSIX and is commonly used for lock files. The "...x" family of modes includes the following options:
- wx create text file for writing with exclusive access.
- wbx create binary file for writing with exclusive access.
- w+x create text file for update with exclusive access.
- w+bx or wb+x create binary file for update with exclusive access.
Opening a file with any of the exclusive modes above fails if the file already exists or cannot be created. Otherwise, the file is created with exclusive (non-shared) access. Additionally, a safer version of fopen() called fopen_s() is also available.
In Conclusion
C11 attempts to fix what was broken in C99. It makes some of the mandatory features of C99 (variable length arrays, complex types and more) optional, and introduces new features that were already available in various implementations. No less important, the C11 designers worked closely with the C++ standards committee to keep the two languages as compatible as possible. Chances are good that unlike its predecessor, C11 will receive a warm reception. As a bonus, software written in C11 will be more robust against security loopholes and malware attacks.
Danny Kalev is a certified system analyst by the Israeli Chamber of System Analysts and software engineer specializing in C++. Kalev has written several C++ textbooks and contributes C++ content regularly on various software developers' sites. He was a member of the C++ standards committee and has a Master's degree in general linguistics.