Cache data outside memory, loading in when referenced.
You may want a more standard data storage solution! See alternatives to make sure another approach doesn’t fit your case better.
This crate uses some unsafe code behind certain features (none of them default features). See unsafe_usage for the listing and explanations.
Dependencies are listed and explained in deps.
See features for a rundown on selecting which of the (many) features to use.
§Motivating Example
Examples that demonstrate more complex uses are in examples.
Assume that you have some LargeStruct that takes up significant storage and can be reduced to a smaller representation for searching. If it is stored in a collection where only a small number of elements are used, keeping them all loaded in memory wastes system resources. A vector store is one such structure. This example assumes a large, hashable type with only a few entries accessed.
#[cfg(all(feature = "bincode", feature = "array"))] {
    use std::{
        env::temp_dir,
        iter::from_fn,
    };

    use serde::{Serialize, Deserialize};

    use backed_data::{
        entry::{
            disks::Plainfile,
            formats::BincodeCoder,
        },
        array::VecBackedArray,
    };

    #[derive(Debug, PartialEq, Eq, Serialize, Deserialize)]
    struct LargeStruct {
        val: u8,
        ...
    }

    impl LargeStruct {
        fn new_random() -> Self {
            ...
        }
    }

    const NUM_BACKINGS: usize = 1_000;
    const ELEMENTS_PER_BACKING: usize = 1_000;

    // This application only needs to find three random elements.
    let query = [0, 10_000, 50_000];

    // Do not use a temporary directory in real code.
    // Use some location that actually guarantees memory leaves RAM.
    let backing_dir = temp_dir().join(BACKING_PATH);
    std::fs::create_dir_all(&backing_dir).unwrap();

    // Define a backed array using Vec.
    let mut backing = VecBackedArray
        ::<LargeStruct, Plainfile, BincodeCoder<_>>::
        new();

    // Build the indices and backing store in 1,000 item chunks.
    for _ in 0..NUM_BACKINGS {
        let chunk_data: Vec<_> = from_fn(|| Some(LargeStruct::new_random()))
            .take(ELEMENTS_PER_BACKING)
            .collect();

        // This is handled automatically by `DirectoryBackedArray` types.
        let target_file = backing_dir.join(uuid::Uuid::new_v4().to_string()).into();

        // Add a new bincode-encoded file that stores 1,000 elements.
        // After this operation, the elements are on disk only (chunk_data
        // is dropped by scope rules).
        backing.append(chunk_data, target_file, BincodeCoder::default()).unwrap();
    }
    // Query for three elements. At most 3,000 elements are loaded, because
    // the data is split into 1,000 element chunks. Only 2,997 useless
    // elements are kept in memory, instead of 999,997 (the store holds
    // 1,000 chunks of 1,000 elements each).
    let results: Vec<_> = query
        .iter()
        .map(|q| backing.get(*q))
        .collect();
}
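The numbers quoted in the example's comments can be checked with a little index arithmetic. The following is a minimal sketch, assuming every chunk holds exactly the same number of elements. The helper names `locate` and `loaded_elements` are hypothetical and are not part of this crate, whose own lookup may work differently.

```rust
use std::collections::BTreeSet;

/// Splits a flat index into (chunk number, offset inside that chunk),
/// assuming every chunk holds exactly `chunk_len` elements.
/// Hypothetical helper, not part of the crate.
fn locate(index: usize, chunk_len: usize) -> (usize, usize) {
    (index / chunk_len, index % chunk_len)
}

/// Counts how many elements end up in memory if every chunk touched by
/// `queries` is loaded in full. Chunks hit more than once count once.
/// Hypothetical helper, not part of the crate.
fn loaded_elements(queries: &[usize], chunk_len: usize) -> usize {
    let touched: BTreeSet<usize> = queries
        .iter()
        .map(|&q| locate(q, chunk_len).0)
        .collect();
    touched.len() * chunk_len
}
```

With the query `[0, 10_000, 50_000]` and 1,000-element chunks, the three indices fall in chunks 0, 10, and 50, so three chunks (3,000 elements) are loaded and 2,997 of them are not needed.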
§Usage
The core structure is entry::BackedEntry, which is wrapped by array and directory. Point it at external data, which it loads when used. That data will remain in memory until unloaded (so subsequent reads avoid the cost of decoding). Try to only unload in one of the following scenarios:
- The data will not be read again.
- The program’s heap footprint needs to shrink.
- The external store was modified by another process.
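The load-then-keep behavior described above can be illustrated with the standard library alone. This is a sketch of the general idea, not the crate's implementation: `LazyEntry` and its methods are hypothetical names, and it caches raw bytes rather than decoded values.

```rust
use std::{fs, io, path::PathBuf};

/// Hypothetical stand-in for a backed entry: the bytes live in a file and
/// are cached in memory after the first read.
struct LazyEntry {
    path: PathBuf,
    cached: Option<Vec<u8>>,
}

impl LazyEntry {
    fn new(path: PathBuf) -> Self {
        Self { path, cached: None }
    }

    /// Reports whether the data currently occupies memory.
    fn is_loaded(&self) -> bool {
        self.cached.is_some()
    }

    /// Reads the file on first use; later calls reuse the cached bytes
    /// and never touch the file.
    fn load(&mut self) -> io::Result<&[u8]> {
        let bytes = match self.cached.take() {
            Some(bytes) => bytes,
            None => fs::read(&self.path)?,
        };
        Ok(self.cached.insert(bytes).as_slice())
    }

    /// Drops the in-memory copy. The next `load` reads the file again,
    /// which also picks up changes made by another process.
    fn unload(&mut self) {
        self.cached = None;
    }
}
```

In this sketch, a change to the file made after the first `load` stays invisible until `unload` is called, which is why an externally modified store appears in the list of reasons to unload.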
Each entry needs a format and (potentially layered) disks to use. The array wrapper also needs choices of containers to hold its array of keys and array of backed entries.
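To make the split between a format and layered disks concrete, here is a rough sketch built only on the standard library. Every name in it (`ByteStore`, `MemStore`, `XorLayer`, `Codec`, `LeWords`) is hypothetical and chosen for illustration. The crate's own traits live under entry::disks and entry::formats and may be shaped quite differently.

```rust
use std::io;

/// Hypothetical storage layer: somewhere raw bytes can be put and fetched.
trait ByteStore {
    fn put(&mut self, bytes: Vec<u8>) -> io::Result<()>;
    fn fetch(&self) -> io::Result<Vec<u8>>;
}

/// Innermost layer: keeps bytes in a Vec (a file would play this role).
#[derive(Default)]
struct MemStore(Vec<u8>);

impl ByteStore for MemStore {
    fn put(&mut self, bytes: Vec<u8>) -> io::Result<()> {
        self.0 = bytes;
        Ok(())
    }
    fn fetch(&self) -> io::Result<Vec<u8>> {
        Ok(self.0.clone())
    }
}

/// Outer layer wrapping any other store. It XORs each byte with a key,
/// standing in for a real transformation such as compression.
struct XorLayer<S> {
    inner: S,
    key: u8,
}

impl<S: ByteStore> ByteStore for XorLayer<S> {
    fn put(&mut self, bytes: Vec<u8>) -> io::Result<()> {
        let key = self.key;
        let transformed: Vec<u8> = bytes.into_iter().map(|b| b ^ key).collect();
        self.inner.put(transformed)
    }
    fn fetch(&self) -> io::Result<Vec<u8>> {
        let key = self.key;
        Ok(self.inner.fetch()?.into_iter().map(|b| b ^ key).collect())
    }
}

/// Hypothetical format: turns values into bytes and back.
trait Codec<T> {
    fn encode(&self, value: &T) -> Vec<u8>;
    fn decode(&self, bytes: &[u8]) -> Option<T>;
}

/// Encodes a list of u32 as consecutive little-endian 4-byte words.
struct LeWords;

impl Codec<Vec<u32>> for LeWords {
    fn encode(&self, value: &Vec<u32>) -> Vec<u8> {
        value.iter().flat_map(|v| v.to_le_bytes()).collect()
    }
    fn decode(&self, bytes: &[u8]) -> Option<Vec<u32>> {
        if bytes.len() % 4 != 0 {
            return None; // Truncated input cannot be a whole number of words.
        }
        let words = bytes
            .chunks_exact(4)
            .map(|w| u32::from_le_bytes([w[0], w[1], w[2], w[3]]))
            .collect();
        Some(words)
    }
}
```

The point of the separation is that the format only knows how to turn a value into bytes, while each disk layer only knows how to move or transform bytes, so the two can be chosen independently and layers can be stacked.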
§Licensing and Contributing
All code is licensed under MPL 2.0. See the FAQ for license questions. The license is non-viral copyleft and does not block this library from being used in closed-source codebases. If you are using this library for a commercial purpose, consider reaching out to dansecob.dev@gmail.com to make a financial contribution.
Contributions are welcome at https://github.com/Bennett-Petzold/backed_data. Please open an issue or PR if:
- Some dependency is extraneous, unsafe, or has a versioning issue.
- Any unsafe code is insufficiently explained or tested.
- There is any other issue or missing feature.
Re-exports§
pub use entry::BackedEntryArr;
pub use entry::BackedEntryArrLock;
pub use entry::BackedEntryCell;
pub use entry::BackedEntryLock;
pub use entry::BackedEntryAsync; [async]
pub use array::VecBackedArray; [array]
pub use directory::StdDirBackedArray; [directory]
pub use directory::ZstdDirBackedArray; [directory and runtime and zstd]
pub use directory::AsyncZstdDirBackedArray; [directory and runtime and async_zstd]
Modules§
- array [array]: Defines BackedArray and the Container/ResizingContainer traits it uses.
- directory [directory]: Defines DirectoryBackedArray.
- entry: Defines BackedEntry, the core of this library.
- examples: Example usage re-exports.
- extra_docs: Additional description of the library.
- test_utils [test]: Defines tools used for ONLY testing.
- utils: Backbone traits and structs for the library implementation.
Macros§
- cursor_vec [test]: Creates a default CursorVec for testing.