Maintain Safety with Unsafe and C-interoperable Rust

Adding Rust to your project’s technology stack can benefit your software in the future, as Rust offers excellent interoperability with other languages. Thus, your developers can experience the best of several languages (by using C/C++ libraries directly in a Rust application, for example) without needing to rewrite legacy code.

However, implementing this feature means you have to use unsafe Rust, which turns off many of Rust’s security checks and mechanisms. That’s why switching to unsafe Rust is a hard “no” for many developers. Can you mitigate these issues?

In this article, we share Apriorit’s experience of working with unsafe and interoperable Rust, examining common issues of these Rust features and offering practical solutions. This article will be useful for development leaders who want to combine Rust with other languages without introducing vulnerabilities to their software.

Contents:

Benefits of Rust interoperability
What is unsafe Rust?
Creating a small Rust library with the Foreign Function Interface
How to manage an unsafe interop Rust library
1. Prevent memory leaks
2. Avoid dangling pointers
3. Mitigate null-pointer dereferencing
4. Watch out for stack unwinding across the FFI boundary
Should you use unsafe Rust? Apriorit’s perspective
Conclusion

Benefits of Rust interoperability

Rust interoperability allows developers to use the Foreign Function Interface (FFI) to call Rust functions from other languages and call functions written in other languages from Rust. Rust is interoperable with C, C++, Python, JavaScript, Go, and many other languages. Being able to mix languages within an application provides developers with many possibilities:

Add Rust to a project without rewriting code. Rust offers many unique features for memory safety, concurrency, and compile-time checks, and developers who code in other languages often want to leverage them. While migrating an existing solution to a new language is a challenge, Rust’s interoperability makes it easier.
Access language-specific libraries. Rust already has a large ecosystem of tools, but with interoperability, developers can use libraries unique to other languages without making any major changes to their code. For example, almost all operating system libraries are written in C, including the widely used libc and WinAPI. Rust’s interoperability feature allows developers to use such system libraries in their Rust applications.
Optimize code performance. Rust generally provides great performance, but in some tasks it's slower than C/C++, and in others C/C++ is slower than Rust. Rust and C++ interoperability allows developers to use both languages to maximize the security and performance of their app. This can come in handy in resource-constrained environments or performance-critical applications.
Add memory safety to existing apps. While Rust code is memory safe by default, other languages struggle with memory safety vulnerabilities. Interop allows developers who use mainly C++, Python, and other languages to add memory checks to their code and make their applications more secure and efficient.

Ultimately, Rust interop capabilities allow you to reduce your development time and budget by taking the best features of several programming languages without figuring out a complex solution.

What is unsafe Rust?

Using interoperability almost always requires writing unsafe code, as even calling an external function is considered an unsafe operation because the Rust compiler has no way to verify whether this function adheres to Rust’s safety standards. Also, the most common way of passing data between Rust and C is through pointers, and dereferencing a raw pointer is considered unsafe.
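As a minimal sketch of both unsafe operations mentioned above, the snippet below declares the C standard library's strlen function and calls it from Rust through a raw pointer. It assumes the 2021 edition (the 2024 edition requires writing unsafe extern "C" for such blocks) and a target that links the C standard library by default. The c_string_length helper is a hypothetical name used only for illustration.

```rust
use std::ffi::{c_char, CString};

// Declaration of a function implemented in the C standard library.
// The compiler trusts this signature; it has no way to verify it.
extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Hypothetical helper: returns the length of `text` as computed by C's `strlen`.
fn c_string_length(text: &str) -> usize {
    // CString appends the terminating NUL byte that C expects.
    let c_text = CString::new(text).expect("input must not contain NUL bytes");
    // SAFETY: `c_text` is a valid NUL-terminated string that stays alive
    // for the whole call, and `strlen` only reads from it.
    unsafe { strlen(c_text.as_ptr()) }
}

fn main() {
    println!("{}", c_string_length("Hello, world!"));
}
```

The unsafe block is kept as small as possible, and the safe helper around it is what the rest of the program would call.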

Unsafe Rust is an extension of the language (a superset of safe Rust) that allows developers to skip some safety checks in their application and, for example, dereference a raw pointer, call unsafe functions, or mutate a static variable. Such code is still subject to the rest of Rust's safety checks, including the borrow checker, but the developer is responsible for verifying that the unsafe code is sound.

To use these additional features, you simply have to use the unsafe keyword and start a new block.

Here is a simple example of a Rust program that uses unsafe features:

Rust

fn main() { 
    let greeting = "Hello, world!".to_string();      

    let greeting_ptr = &greeting as *const _; 

    unsafe { 
        println!("{}", *greeting_ptr); 
    } 
}

This code contains a dereferenced raw pointer, which is considered unsafe because Rust cannot guarantee that it points to valid data. In our example, however, it’s perfectly safe, as we are sure that the greeting variable wasn’t dropped before dereferencing.
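The other two operations listed earlier, calling an unsafe function and mutating a static variable, can be sketched in the same way. The names read_value, count_call, and CALL_COUNT are hypothetical, and the sketch assumes a single-threaded program.

```rust
// A mutable global: the compiler cannot prove that accesses are free of data races.
static mut CALL_COUNT: u32 = 0;

/// # Safety
/// `ptr` must be non-null, aligned, and point to an initialized `i32`.
unsafe fn read_value(ptr: *const i32) -> i32 {
    *ptr
}

/// Increments the global counter and returns its new value.
fn count_call() -> u32 {
    // SAFETY: this sketch is single-threaded, so nothing else
    // touches CALL_COUNT at the same time.
    unsafe {
        CALL_COUNT += 1;
        CALL_COUNT
    }
}

fn main() {
    let number = 42;
    // SAFETY: the pointer comes from a live reference to `number`.
    let value = unsafe { read_value(&number) };
    println!("value = {}, calls = {}", value, count_call());
}
```

Each unsafe block carries a SAFETY comment that states why the operation is sound at that call site.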

Here’s another example of Rust’s unsafe code that has undefined behavior:

Rust

fn main() { 
    let greeting_ptr;  

    { 
        let greeting = "Hello, world!".to_string(); 
        greeting_ptr = &greeting as *const _; 
    } 

    unsafe { 
        println!("{}", *greeting_ptr); 
    } 
}

In this case, the greeting variable was dropped before we dereferenced the pointer, so the pointer refers to a freed chunk of memory that could have already been reused by the allocator. We created a dangling pointer, and Rust didn't stop us because raw pointers aren't tracked by the borrow checker and the unsafe block allowed the dereference. Had we used a reference instead of a raw pointer, the borrow checker would have rejected the code because greeting doesn't live long enough.

Rust and C/C++ interop inherits all the drawbacks of unsafe Rust and gives developers more things to consider. We’ll examine them later in the article, but first, let’s create a Rust sample library that we can call from C/C++ code and show how unsafe Rust can produce vulnerable code.

Creating a small Rust library with the Foreign Function Interface

Say we want to build a small Rust library callable from other languages that returns a string saying “Hello, world!” Returning a dynamically allocated string requires some manual memory management to prevent dangling pointers and memory leaks.

Our Rust library will look like this:

Rust

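use std::ffi::CString;

use libc::c_char;

/// Returns a heap-allocated, null-terminated string.
/// The caller owns it and must release it with `free_string`.
#[no_mangle]
pub extern "C" fn hello_world() -> *const c_char {
    CString::new("Hello, world!")
        .expect("shouldn't fail") // the literal contains no interior NUL bytes
        .into_raw() // hand ownership to the caller instead of dropping
}

/// Frees a string previously returned by `hello_world`.
///
/// # Safety
/// `allocated_string` must be null or a pointer returned by `hello_world`
/// that has not been freed yet.
#[no_mangle]
pub unsafe extern "C" fn free_string(allocated_string: *const c_char) {
    if !allocated_string.is_null() {
        // Rebuild the CString and drop it immediately to free the memory.
        let _ = CString::from_raw(allocated_string as *mut _);
    }
}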

The hello_world function returns a pointer to a null-terminated array of c_char values. To avoid dealing with raw bytes directly, we use CString, Rust's abstraction over null-terminated C-style strings. CString provides a convenient API with a few helper methods that make it much easier to return a string from Rust to C.

For example, the CString::into_raw method transfers ownership of the string to the caller, so Rust won't automatically free the allocated memory when the variable goes out of scope. The developer has to free this memory manually after using it; otherwise, it leaks.

We also annotated the hello_world function with the #[no_mangle] attribute that prevents name mangling. Mangling changes the name of a function or variable to something more unique to avoid name collisions. In our case, we disabled mangling because with it, we wouldn’t know the function’s name and wouldn’t be able to call it. Adding the #[no_mangle] attribute ensures that the hello_world function has the exact name we give it.

Needing to manually free memory is the reason why we need the free_string function. It takes the pointer returned by hello_world as input and uses the CString::from_raw method to convert it back to an owned value. When the variable associated with this value is dropped at the end of the scope, the value is freed automatically.

In our example, we simply ignore the returned value using the wildcard pattern to drop it immediately. Notice that this function is marked as unsafe because it works with a raw pointer. The null check covers only one failure mode: the caller still has to guarantee that the pointer was returned by hello_world and hasn't been freed already.
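The ownership handoff described above can also be exercised entirely from Rust, which is handy for unit tests. The following is a sketch under a few assumptions: it uses std::ffi::c_char instead of the libc crate to stay dependency-free, and make_greeting and release_greeting are hypothetical stand-ins for the exported functions.

```rust
use std::ffi::{c_char, CStr, CString};

/// Hands a heap-allocated C string to the caller, who now owns it.
fn make_greeting() -> *mut c_char {
    CString::new("Hello, world!")
        .expect("literal contains no NUL bytes")
        .into_raw()
}

/// # Safety
/// `ptr` must be null or a pointer obtained from `make_greeting`
/// that has not been released yet.
unsafe fn release_greeting(ptr: *mut c_char) {
    if !ptr.is_null() {
        // Rebuild the CString so that its destructor frees the buffer.
        drop(CString::from_raw(ptr));
    }
}

fn main() {
    let raw = make_greeting();
    // SAFETY: `raw` is valid and NUL-terminated until it is released.
    let text = unsafe { CStr::from_ptr(raw) }.to_string_lossy().into_owned();
    println!("{}", text);
    // SAFETY: `raw` came from `make_greeting` and is released exactly once.
    unsafe { release_greeting(raw) };
}
```

Copying the text into an owned String before releasing the pointer means nothing refers to the buffer once it is freed.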

The Cargo.toml file, which contains metadata for package compilation, should look like this:

TOML

[package]
name = "simple-ffi-library"
version = "0.1.0"
edition = "2021"
build = "build.rs"

[dependencies]
libc = "0.2.147" 

[lib] 
crate-type = ["cdylib"] 

[build-dependencies] 
cbindgen = "0.26.0"

The build field of the package section specifies a build file that can be used to do some additional things during the build phase. In our case, we want to use the cbindgen crate together with a build script to automatically create header files for our FFI interface. The build script looks like this:

Rust

use std::env; 

use cbindgen::{self, Config}; 

fn main() { 
    let crate_dir = env::var("CARGO_MANIFEST_DIR").unwrap(); 
    let mut config = Config::default(); 
    config.namespace = Some(String::from("ffi")); 
    let output_file = { 
        let directory = env::var("CARGO_TARGET_DIR").unwrap_or(crate_dir.clone() + "/target"); 
        format!("{}/{}.hpp", directory, env::var("CARGO_PKG_NAME").unwrap()) 
    }; 
    cbindgen::Builder::new() 
        .with_crate(crate_dir) 
        .with_config(config) 
        .generate() 
        .expect("Bindings generation failed") 
        .write_to_file(output_file); 
}

This build script creates a header file named <Package Name>.hpp inside the target directory when the library is compiled. We can set the CARGO_TARGET_DIR environment variable to create it in any other place. We can also configure a namespace so that our exported functions won’t conflict with existing functions. Let’s call this namespace “ffi.”

Building our library is as simple as running the cargo build command. Now we have a compiled library and a header file generated by cbindgen.

All that’s left to do is link our library and call the exposed functions from the C/C++ code:

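C++

#include "simple-ffi-library.hpp" // header generated by cbindgen
#include <iostream>

int main()
{
    const char *string_from_rust = ffi::hello_world();
    std::cout << string_from_rust << std::endl;
    // Return the string to Rust so that it can free the memory
    ffi::free_string(string_from_rust);
    return 0;
}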

The important part here is remembering to call the free_string function after we are done with the string. Forgetting to do so will result in a memory leak. Even though memory leaks are memory-safe, they can degrade app performance and cause crashes, so it's best to avoid them.

Now we can compile and run the code:

Shell

$ ./bin/main.exe 
Hello, world!

Even though the provided example is safe and works as expected, it could have gone wrong in many ways. For example, we could simply have forgotten to free the memory. Or we could have used the CString::as_ptr method, which also returns a pointer to the underlying string but does nothing to prevent Rust from freeing the memory when the string is dropped at the end of the function, creating a dangling pointer.
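To make the difference between the two methods concrete, here is a small sketch of a correct use of CString::as_ptr, where the owning CString outlives every use of the pointer. The byte_len helper is a hypothetical function standing in for any code that reads a borrowed C string.

```rust
use std::ffi::{c_char, CStr, CString};

/// Hypothetical helper that reads a borrowed C string without taking ownership.
///
/// # Safety
/// `ptr` must point to a valid NUL-terminated string that outlives this call.
unsafe fn byte_len(ptr: *const c_char) -> usize {
    CStr::from_ptr(ptr).to_bytes().len()
}

fn main() {
    let owned = CString::new("Hello, world!").expect("no interior NUL bytes");

    // `as_ptr` only borrows: `owned` still owns the buffer and will free it.
    let borrowed = owned.as_ptr();
    // SAFETY: `owned` is still alive, so `borrowed` points to valid memory.
    let len = unsafe { byte_len(borrowed) };
    println!("{}", len);

    drop(owned);
    // From here on `borrowed` dangles; dereferencing it would be undefined behavior.
    // Returning such a pointer from an FFI function has the same effect, because
    // the CString is dropped when the function returns. Use `into_raw` in that case.
}
```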

Now that we better understand interop and unsafe Rust, let’s analyze common issues they create and ways you can avoid them.

How to manage an unsafe interop Rust library

Everything that can go wrong will eventually go wrong according to Murphy’s law. Your development team should be prepared to face any issues that unsafe Rust can throw at you.

The following best practices will help your team avoid many issues by enforcing a consistent code style and guidelines. Here's what you should do when using unsafe Rust:

Document each unsafe function and block. It's a common practice to comment on each block of unsafe code, explain why this block is in fact safe, and show the reasoning behind using unsafe features. This pattern is so common that Rust's linter Clippy has lints for it: missing_safety_doc warns about public unsafe functions that lack a Safety section in their documentation, and the opt-in undocumented_unsafe_blocks lint flags unsafe blocks that have no safety comment.
Minimize the amount of unsafe code. Use unsafe Rust only when it’s absolutely necessary. The rule of thumb is to do everything possible to make unsafe blocks as small as possible. Unsafe Rust should be used responsibly — not as an escape hatch to do things the C/C++ way.
Create safe wrappers over unsafe code. If you want to include unsafe code in your library, create safe wrappers around unsafe code to reduce the overall amount of unsafe code and make it possible to use the library without unsafe blocks or functions. Make sure to verify that the safe wrapper is indeed safe.
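The three practices above can be sketched together in a few lines: a documented unsafe function, a single small unsafe block, and a safe wrapper around it. The functions sum_raw and sum are hypothetical examples, not part of the library built earlier.

```rust
/// Sums `len` values starting at `ptr`.
///
/// # Safety
/// `ptr` must be non-null, aligned, and point to `len` initialized `i32`
/// values that are not mutated for the duration of the call.
pub unsafe fn sum_raw(ptr: *const i32, len: usize) -> i64 {
    let mut total = 0i64;
    for i in 0..len {
        // SAFETY (upheld by the caller): `ptr.add(i)` stays inside the buffer.
        total += i64::from(*ptr.add(i));
    }
    total
}

/// Safe wrapper: a slice already guarantees everything `sum_raw` requires.
pub fn sum(values: &[i32]) -> i64 {
    // SAFETY: `values.as_ptr()` is valid for `values.len()` reads, and the
    // shared borrow prevents mutation while we read.
    unsafe { sum_raw(values.as_ptr(), values.len()) }
}

fn main() {
    println!("{}", sum(&[1, 2, 3]));
}
```

Callers only ever see sum, so the invariants of sum_raw have to be checked in exactly one place.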

These practices will help improve the general quality and safety of your code. However, there are some specific issues of unsafe Rust that you need to address directly. Here are key vulnerabilities to pay attention to:

1. Prevent memory leaks

Rust usually frees dynamically allocated memory automatically once the owning variable goes out of scope. When you hand ownership over to a raw pointer to pass data across the FFI boundary (for example, with into_raw), this no longer happens, and you need to free the memory manually. Forgotten memory remains unavailable and can't be reused until the application terminates, thus consuming more resources and reducing application performance. For resource-constrained environments like embedded devices, memory leaks can be critical and even cause crashes.

As we have already seen, you can easily create a memory leak when dealing with FFI and passing objects between two languages. Tracking ownership and understanding who should free each allocated object can get tedious. Follow these rules to prevent memory leaks:

Clarify ownership boundaries. Decide which part of your app owns specific memory and deallocates it. This will also help with avoiding double free errors. Note these rules in the comments and documentation to avoid confusion and errors.
Let each language manage its own allocations. In general, you shouldn't free C's memory from Rust, or vice versa. Each language may use its own allocator, and releasing memory with an allocator that didn't create it can lead to undefined behavior such as heap corruption.
Use object-based APIs when possible. Such APIs can help you define clear boundaries between safe and unsafe code and make it easier to track relationships between objects.
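One way to respect these ownership rules on the Rust side is to copy data out of a C-owned buffer instead of taking it over. The sketch below assumes the C side keeps ownership of the string; copy_from_c is a hypothetical helper, and the byte string in main merely stands in for a buffer allocated by C code.

```rust
use std::ffi::{c_char, CStr};

/// Copies a C-owned string into a Rust-owned `String`.
/// Returns `None` for a null pointer.
///
/// # Safety
/// If non-null, `ptr` must point to a valid NUL-terminated string. The C side
/// keeps ownership and remains responsible for freeing it.
pub unsafe fn copy_from_c(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    // Borrow the C buffer just long enough to copy its contents.
    let borrowed = CStr::from_ptr(ptr);
    Some(borrowed.to_string_lossy().into_owned())
}

fn main() {
    // Stand-in for a buffer owned by C code: a static NUL-terminated byte string.
    let c_owned: &[u8] = b"owned by C\0";
    // SAFETY: the buffer is NUL-terminated and lives for the whole program.
    let copy = unsafe { copy_from_c(c_owned.as_ptr() as *const c_char) };
    println!("{:?}", copy);
}
```

After the copy, Rust frees only the String it allocated itself, and the original buffer is left for its owner to release.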

Following these rules will help you avoid some errors but doesn’t eradicate the risk of a few memory leaks during development. The important part is being able to detect and fix them using memory profiling tools like Valgrind, which detects memory leaks and tracks memory use during testing. It’s always recommended to extensively test your application for memory leaks before releasing it.

2. Avoid dangling pointers

A dangling pointer points to some memory that has already been deallocated and leads to undefined behavior. Safe Rust protects you from this issue by disallowing dereferencing of raw pointers and enforcing lifetimes at compile time, but dereferencing a raw pointer is one of unsafe Rust’s features.

Avoiding dangling pointers is fully up to the developer. Neither Rust nor C can do anything to stop developers from creating and using dangling pointers in their code. So developers need to avoid common causes of dangling pointers, such as returning a pointer to a local variable, forgetting about Rust's automatic deallocation at the end of the scope, and not using heap-allocated data when data is expected to live between function calls.

To work with heap allocation, you can use Box, which provides the into_raw method to transfer memory management to the caller, thereby preventing automatic deallocation. Releasing memory allocated by Box is pretty easy and requires a single call to the Box::from_raw method, which accepts a pointer and transfers memory management back to Box’s destructor.

You can also use the previously mentioned object-based APIs and expose methods that are responsible for memory allocation and deallocation. A clean and consistent API design will make it harder to create dangling pointers in code.
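A minimal sketch of such an object-based API built on Box could look as follows. The Counter type and the counter_new, counter_increment, and counter_free functions are hypothetical names chosen for illustration, and the #[no_mangle] attribute syntax assumes the 2021 edition.

```rust
/// Opaque object handed out to C callers (hypothetical example type).
pub struct Counter {
    value: u32,
}

/// Allocates a counter on the heap and gives ownership to the caller.
#[no_mangle]
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { value: 0 }))
}

/// Increments the counter and returns the new value, or 0 for a null pointer.
///
/// # Safety
/// `counter` must be null or a live pointer returned by `counter_new`.
#[no_mangle]
pub unsafe extern "C" fn counter_increment(counter: *mut Counter) -> u32 {
    match counter.as_mut() {
        Some(counter) => {
            counter.value += 1;
            counter.value
        }
        None => 0,
    }
}

/// Releases a counter.
///
/// # Safety
/// `counter` must be null or a pointer returned by `counter_new`
/// that has not been freed yet.
#[no_mangle]
pub unsafe extern "C" fn counter_free(counter: *mut Counter) {
    if !counter.is_null() {
        // Give the allocation back to Box so that its destructor frees it.
        drop(Box::from_raw(counter));
    }
}

fn main() {
    let counter = counter_new();
    // SAFETY: `counter` was just created and is freed exactly once below.
    unsafe {
        counter_increment(counter);
        println!("{}", counter_increment(counter));
        counter_free(counter);
    }
}
```

Because allocation and deallocation both live behind the same small API, the foreign caller never needs to know how the object is laid out or which allocator created it.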

3. Mitigate null-pointer dereferencing

Null-pointer dereferencing can happen when a Rust function expects a pointer to some data as one of its arguments. The caller can easily pass a null pointer instead of a valid one. Dereferencing such a pointer leads to undefined behavior.

When it comes to avoiding null-pointer dereferencing, the rule of thumb is to verify that every incoming pointer is not a null pointer and return an error if it is. Verifying each incoming pointer can get tedious, so you can use a simple macro to reduce code duplication:

Rust

macro_rules! not_null { 
    ($x:expr) => { 
        if $x.is_null() { 
            // Return an appropriate error here
        } 
    }; 
}
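One possible way to fill in the placeholder is to let the macro take the error value as a second argument and return it from the enclosing function. The sketch below assumes an error-code convention in which -1 signals a null pointer; the text_length function is a hypothetical example.

```rust
use std::ffi::{c_char, c_int, CStr};

/// Returns `$err` from the enclosing function when `$ptr` is null.
macro_rules! not_null {
    ($ptr:expr, $err:expr) => {
        if $ptr.is_null() {
            return $err;
        }
    };
}

/// Hypothetical FFI function: byte length of `text`, or -1 for a null pointer.
///
/// # Safety
/// If non-null, `text` must point to a valid NUL-terminated string.
#[no_mangle]
pub unsafe extern "C" fn text_length(text: *const c_char) -> c_int {
    not_null!(text, -1);
    CStr::from_ptr(text).to_bytes().len() as c_int
}

fn main() {
    // SAFETY: a null pointer is explicitly allowed and reported as -1.
    let status = unsafe { text_length(std::ptr::null()) };
    println!("{}", status);
}
```

Note that a null check says nothing about whether a non-null pointer is valid, so the safety contract in the documentation is still needed.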

4. Watch out for stack unwinding across the FFI boundary

Stack unwinding happens when a Rust program panics during execution: the runtime walks back up the stack and runs destructors along the way. This becomes a problem at the FFI boundary because Rust's implementation of unwinding wasn't designed to be compatible with any other language's unwinding strategy. Unwinding from Rust into another language, as well as unwinding from another language into Rust, leads to undefined behavior.

You can catch panics in your FFI-exposed Rust functions using the built-in std::panic::catch_unwind function. It lets you stop an unwinding panic before it crosses the FFI boundary and report the error gracefully, for example as an error code.
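As a hedged sketch of this approach, the function below wraps a computation that may panic in catch_unwind and converts the panic into an error code. The checked_divide function and the ERR_PANIC and ERR_NULL codes are assumptions made for illustration, and the example relies on the default panic = "unwind" strategy, since catch_unwind cannot intercept panics that abort the process.

```rust
use std::panic;

/// Error code reported when the Rust code panicked (an assumed convention).
pub const ERR_PANIC: i32 = -1;
/// Error code reported when the output pointer is null (an assumed convention).
pub const ERR_NULL: i32 = -2;

/// Divides `a` by `b` and stores the quotient in `*out`.
/// Returns 0 on success, `ERR_NULL` if `out` is null,
/// and `ERR_PANIC` if the computation panicked.
///
/// # Safety
/// `out` must be null or point to writable memory for an `i32`.
#[no_mangle]
pub unsafe extern "C" fn checked_divide(a: i32, b: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return ERR_NULL;
    }
    // Integer division panics on division by zero (and on i32::MIN / -1).
    match panic::catch_unwind(|| a / b) {
        Ok(quotient) => {
            *out = quotient;
            0
        }
        // The panic stops here instead of unwinding into the foreign caller.
        Err(_) => ERR_PANIC,
    }
}

fn main() {
    let mut result = 0;
    // SAFETY: `result` is a live, writable i32.
    let status = unsafe { checked_divide(10, 0, &mut result) };
    println!("status = {}", status);
}
```

The default panic hook still prints the panic message to standard error, but the caller only sees the error code.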

Also, it’s best to avoid C++ functions that can throw exceptions, as unwinding into Rust also causes undefined behavior.

Should you use unsafe Rust? Apriorit’s perspective

Considering all that we have discussed, you may be hesitant to use unsafe Rust in your projects, as it may seem even harder than using C or C++ directly. Rust makes you think about how your code works before and while writing it, not after. You have to put in effort to satisfy the borrow checker and do everything the Rust way.

In our opinion, this effort pays off in the end, as you get a safe and performant Rust application while saving development time with interop functions and combining the best of two languages.

Note that unsafe Rust is still safer than C/C++ because the borrow checker still runs at compile time, even inside unsafe functions and blocks. For example, the compiler rejects ridiculous things like this attempt to return a reference to a temporary vector:

Rust

unsafe fn func() -> &'static [u8] { 
    &vec![0xff; 64] 
} 

fn main() { 
    unsafe { 
        println!("{:?}", func()); 
    } 
}

But equivalent code compiles without any errors in C++:

C++

int main()
{ 
    int *bad_ptr; 
    { 
        int value = 10; 
        bad_ptr = &value; 
    } 
    // bad_ptr is a dangling pointer 
}

The obvious disadvantage of unsafe Rust is that it is extremely verbose when compared to C and looks a lot more complex than safe Rust. If you find yourself mostly using unsafe Rust and don’t need Rust’s safety-related features like the ownership system and borrow checker, then it’s probably better to switch to C/C++ instead. But in all other cases, unsafe Rust is usually a better choice, as it provides enhanced safety compared to C/C++, a vast collection of production-grade libraries, and mechanisms for managing safety issues.

Conclusion

Using Rust and C++ interoperability is a great option when you need to add libraries or functions from another language to your Rust code without using sketchy APIs, building custom functionality, or compromising on your app’s security and performance. Working with unsafe Rust is more challenging than working with its safe counterpart, but all of the issues it creates can be solved — it just requires your team to understand the peculiarities of Rust and be attentive.

And if dealing with these peculiarities on your own seems too difficult, you can always turn to us! At Apriorit, we have a team of Rust experts that regularly deliver Rust-based firmware, cybersecurity solutions, and other products. They know how to plan and implement the solution you need in a way that delivers all of Rust’s benefits while mitigating its vulnerabilities.
