Reduce memory footprint by over 99% for M.E.D. — Performance Matters

Wei Huang · Published in Level Up Coding · Jun 30, 2023

Weekend build and learn

In our last weekend build and learn, I shared why and how I built M.E.D. — a Rust-powered Data Masking, Encryption, and Decryption CLI tool.

As of today, around 600 users have already downloaded and tried out M.E.D. If you are also interested in giving it a try, the download link is available below.

For how to use it, you can read this article:

The current version is 0.5.9, and its functionality, based on the original design and use case, is getting stable and mature. It should be close to a 1.0.0 release soon.

For this weekend build and learn, I want to share the story of a breaking change in version 0.3.1. The issue is as follows.

I will walk you through how I identified the issue, found out where the bottleneck was, and how I fixed it.

Problem statement

During performance testing, the app crashed when processing a file over 2.5 GB on my OneNet Book 4. And I realized this would be a bigger problem than I thought.

Let's find out

Let's open htop (an interactive system monitor, process viewer, and process manager) to see what the runtime looks like on a MacBook Pro with 16 GB of RAM.

When it is at the "read CSV files" step, you can see that the swap memory (Swp) has almost reached the 16 GB limit.

htop runtime screenshot — credit by author

And the runtime memory usage is around:

the top runtime screenshot — credit by author

However, at the "write" stage, the swap memory drops back to a lower level of consumption.

So what is "Swp"?

Swap (Swp) is a dedicated file- or partition-backed region on disk that the operating system uses as overflow for RAM. Creating swap space allows the OS to move less-used pages of scratch memory out to disk, freeing physical memory for running processes and shared libraries, which generally improves performance.

Basically, swap (Swp) serves two roles:

  1. To move less-used "pages" out of memory into storage so physical memory can be used more efficiently
  2. If memory is insufficient, it acts as "extra" memory.

If it's case #1, everything is fine.

In case #2, there are two possible scenarios:

  1. Disk use increases. If your disks aren't fast enough to keep up, the system may thrash, and you'd experience slowdowns as data is swapped in and out of memory. This creates a bottleneck and can leave the system unresponsive.
  2. You run out of memory entirely, resulting in weirdness and crashes.

In our case, our program did NOT use memory efficiently, especially during the read-CSV stage.

Now we know where. Next, let's find out why the memory gets fully swapped.

Original Design

Original UML — Application Design

Initially, I broke it down into four steps:

  1. new — initialize the new processor
  2. load — load the files into memory
  3. run — run the masking/encryption/decryption
  4. write — write back to the output location.

Based on the htop analysis, the issue is in step #2 (load).
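Under stated assumptions (the names and fields below are illustrative stand-ins, not the actual M.E.D. types), the original four-step design can be sketched like this:

```rust
// Hypothetical sketch of the original four-step design; names, fields,
// and the masking logic are illustrative, not the real M.E.D. code.
struct Processor {
    files: Vec<String>,   // file contents, fully loaded into memory
    output: Vec<String>,  // processed results, also held in memory
}

impl Processor {
    // 1. new: initialize the processor
    fn new() -> Self {
        Processor { files: Vec::new(), output: Vec::new() }
    }

    // 2. load: pull every file's content into memory up front
    fn load(&mut self, contents: Vec<String>) {
        self.files = contents;
    }

    // 3. run: apply masking (a toy stand-in for mask/encrypt/decrypt)
    fn run(&mut self) {
        self.output = self
            .files
            .iter()
            .map(|s| "*".repeat(s.chars().count()))
            .collect();
    }

    // 4. write: hand the processed output to its destination
    fn write(&self) -> &[String] {
        &self.output
    }
}

fn main() {
    let mut p = Processor::new();
    p.load(vec!["alice".to_string(), "bob".to_string()]);
    p.run();
    println!("{:?}", p.write());
}
```

Notice that both `load` and `run` keep the entire dataset resident; that design choice is exactly where the trouble starts.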

Dive into the code

async fn load(&mut self, num_workers: &u16, file_path: &str) -> Result<(), MedError> {
    ...
    // Walk the directory and queue one read task per file.
    for entry in WalkDir::new(file_path)
        .follow_links(true)
        .into_iter()
        .filter_map(|e| e.ok())
        .filter(|e| !e.path().is_dir())
    {
        ...
        new_worker.pool.execute(move || {
            read_csv(tx, entry.path().display().to_string()).unwrap();
        });
    }
    ...

    Ok(())
}

pub fn read_csv(tx: flume::Sender<CsvFile>, path: String) -> Result<(), MedError> {
    ...
    // Every record is buffered here until the whole file has been read.
    let mut data: Vec<StringRecord> = Vec::new();
    ...
    reader.records().for_each(|record| {
        match record {
            Ok(r) => {
                total_records += 1;
                data.push(r);
            }
            Err(err) => {
                ...
            }
        };
    });
    ...
    Ok(())
}

Translated into plain English, the code:

  1. Loops over the files in the directory and puts them into the worker queue for concurrent execution.
  2. For each file-read task, reads the file content and pushes every record into a Vec, to be consumed later by step #3 [run] (masking/encryption/decryption).

Clearly, the size of that Vec depends on the file size, so memory spills over into swap, leading to the application crashing or the system becoming unresponsive.
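As a minimal, standard-library-only illustration of the problem (the real code uses the csv crate and WalkDir, which are elided here), buffering every record before processing makes peak memory scale with the whole file rather than with a single record:

```rust
use std::io::{BufRead, Cursor};

// Simplified stand-in for read_csv: every record is pushed into a Vec,
// so peak memory grows with the entire file, not with one record.
fn read_all_records<R: BufRead>(reader: R) -> Vec<String> {
    let mut data: Vec<String> = Vec::new();
    for line in reader.lines() {
        if let Ok(record) = line {
            data.push(record); // buffered until the later "run" step
        }
    }
    data
}

fn main() {
    // A 2.5 GB input would put roughly 2.5 GB of records into this Vec.
    let csv = "id,name\n1,alice\n2,bob\n";
    let records = read_all_records(Cursor::new(csv));
    println!("buffered {} records", records.len());
}
```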

The solution

Once the root cause of the memory-over-swap issue was identified, it was easy to reshape the solution. The guiding principle:

Avoid holding the struct in memory and process it once we read it.

The benefit is avoiding large memory allocations and removing unnecessary processing steps, which also simplifies the code-level implementation.
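A minimal sketch of that principle, using only the standard library (the real implementation uses the csv crate, and `mask` here is a toy stand-in for the actual masking/encryption/decryption step): each record is read, transformed, and written immediately, so only one record is resident at a time.

```rust
use std::io::{BufRead, Write};

// Toy masking step: replace every character except the delimiter.
fn mask(record: &str) -> String {
    record.chars().map(|c| if c == ',' { ',' } else { '*' }).collect()
}

// Read -> process -> write, one record at a time, never buffering
// the whole file. Returns the number of records processed.
fn stream_process<R: BufRead, W: Write>(input: R, mut output: W) -> std::io::Result<u64> {
    let mut total_records = 0u64;
    for line in input.lines() {
        let record = line?;
        writeln!(output, "{}", mask(&record))?; // written immediately, then dropped
        total_records += 1;
    }
    Ok(total_records)
}

fn main() -> std::io::Result<()> {
    let input = std::io::Cursor::new("1,alice\n2,bob\n");
    let mut out = Vec::new();
    let n = stream_process(input, &mut out)?;
    println!("processed {} records", n);
    Ok(())
}
```

Peak memory is now bounded by the size of the largest single record, not the size of the file.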

updated UML — credit by author
async fn load(&mut self) -> Result<Metrics, MedError> {
    ...
    // Loop over the file paths
    for entry in WalkDir::new(&self.runtime_params.file_path)
        .follow_links(true)
        .into_iter()
        .filter_entry(is_not_hidden)
        .filter_map(|e| e.ok())
        .filter(|e| !e.path().is_dir())
    {
        ...
        match self.runtime_params.file_type {
            FileType::CSV => {
                new_worker.pool.execute(move || {
                    csv_processor(tx_metadata, &files_path, &output_dir, process_runtime)
                        .unwrap();
                });
            }
            FileType::JSON => {
                new_worker.pool.execute(move || {
                    json_processor(tx_metadata, &files_path, &output_dir, process_runtime)
                        .unwrap();
                });
            }
        }
    }
    ...
    rx_metadata.iter().for_each(|item| {
        // Update the metrics
    });
    Ok(self.metrics.clone())
}

pub fn csv_processor(
    tx_metadata: flume::Sender<Metadata>,
    files_path: &str,
    output_path: &str,
    process_runtime: ProcessRuntime,
) -> Result<(), MedError> {
    ...
    reader.into_records().for_each(|record| {
        match record {
            Ok(records) => {
                // 1. Read by record
                // 2. Masking, encryption, or decryption
                // 3. Write
            }
            Err(err) => {
                ...
            }
        };
    });
    ...
    tx_metadata
        .send(Metadata {
            total_records,
            failed_records,
            record_failed_reason,
        })
        .unwrap();
    Ok(())
}
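The metadata channel pattern above can be sketched with std::sync::mpsc in place of flume (flume's API is similar but not identical, and the `Metadata` struct here is a reduced stand-in): each worker sends its per-file summary, and the loader aggregates them once all senders are dropped.

```rust
use std::sync::mpsc;
use std::thread;

// Reduced, illustrative version of the article's Metadata struct.
struct Metadata {
    total_records: u64,
    failed_records: u64,
}

// Spawn one thread per worker summary, collect everything over a
// channel, and fold the results into overall totals.
fn aggregate(workers: Vec<Metadata>) -> (u64, u64) {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for meta in workers {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            // Each worker reports its per-file summary when done.
            tx.send(meta).unwrap();
        }));
    }
    drop(tx); // close the channel so the receiving iterator terminates

    let mut totals = (0u64, 0u64);
    for item in rx.iter() {
        totals.0 += item.total_records;
        totals.1 += item.failed_records;
    }
    for h in handles {
        h.join().unwrap();
    }
    totals
}

fn main() {
    let (total, failed) = aggregate(vec![
        Metadata { total_records: 10, failed_records: 1 },
        Metadata { total_records: 5, failed_records: 0 },
    ]);
    println!("total={}, failed={}", total, failed);
}
```

Dropping the original sender before iterating is the key detail: `rx.iter()` only ends once every `Sender` clone is gone.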

Output

Testing the new implementation, the memory swap (Swp) remains at a healthy level, under 5 GB.

And the runtime memory usage is only around 2,000 KB (3 MB at most).

Top runtime screenshot — credit by author

Compared with before, the runtime memory footprint dropped from ~20 GB to ~3 MB. That's a reduction of more than 99.9% (roughly a 6,800x factor), which makes it practical to run on much smaller infrastructure.
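As a quick sanity check on those numbers (treating ~20 GB and ~3 MB as the approximate before/after figures from the screenshots):

```rust
// Returns (reduction factor, percent reduction) for a before/after
// pair, both expressed in MB. Figures are approximate.
fn reduction(before_mb: f64, after_mb: f64) -> (f64, f64) {
    (before_mb / after_mb, (1.0 - after_mb / before_mb) * 100.0)
}

fn main() {
    // ~20 GB before (in MB), ~3 MB after.
    let (factor, percent) = reduction(20.0 * 1024.0, 3.0);
    println!("{:.0}x smaller, {:.3}% reduction", factor, percent);
}
```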

Just imagine the VM cost if this were running in a cloud environment. It demonstrates once again: Performance Matters!

This fix was released in version 0.5.2.

I have also done some benchmarking based on file size.

Model Name: MacBook Pro
Processor Name: 6-Core Intel Core i7
Processor Speed: 2.6 GHz
Total Number of Cores: 6
Memory: 16 GB
Performance test capture table — credit by author

Thank you for reading and for your support. The Weekend Build and Learn carries on.

Please follow me on Medium if you enjoy the Weekend Build and Learn series.
