Description

Nif wrapper for the xor_filter: https://github.com/FastFilter/xor_singleheader

They're 'Faster and Smaller Than Bloom and Cuckoo Filters'.

Benchmarks are included in the repo's README; the library is 2x-12x faster than some Bloom filter libraries.

Programming language: C

License: Apache License 2.0

Tags: Algorithms And Data Structures, Erlang, Data Structures

Latest version: v0.6.0

README

exor_filter

Nif wrapper for the xor_filter: https://github.com/FastFilter/xor_singleheader

They're 'Faster and Smaller Than Bloom and Cuckoo Filters'.

This library uses dirty NIFs to initialize filters of more than 10K elements, so make sure your environment is set up correctly. A filter of 10M elements can be initialized within 4 seconds, or within 2.5 seconds if the library is used unsafely.

Benchmarks
Installation
Example Usage
- Basic Usage
- Incremental Initialization
Hashing
- Hashing Example
- Hashing API
  - Pre-Hashing and Custom Hashing
Elixir Example
Custom Return Values
Serialization
xor16
Buffered Initialization

Benchmarks

The exor_benchmark repo was used to compare access times to popular bloom filter libraries.

Benchmark Graph

Installation

Available on hex.pm.

For rebar3:

%% rebar.config

{deps, [
  {exor_filter, "0.7.1"}
]}.

For Mix:

## mix.exs

defp deps do
  [
    {:exor_filter, "~> 0.7.1"}
  ]
end

Note: if you are using an Erlang/OTP release older than 23, use v0.5.2 of this library. Otherwise, use the latest version.

Example Usage

Basic Usage

Basic usage with default hashing is as follows:

Filter = xor8:new(["cat", "dog", "mouse"]),
true   = xor8:contain(Filter, "cat"),
false  = xor8:contain(Filter, "goose").

Filters are initialized independently:

Filter1 = xor8:new([1, 2, 3]),
Filter2 = xor8:new([4, 5, 6]),

false   = xor8:contain(Filter1, 6),
true    = xor8:contain(Filter1, 2),

false   = xor8:contain(Filter2, 2),
true    = xor8:contain(Filter2, 5).

Incremental Initialization

This is now the preferred method of usage. To create a filter incrementally, use the following API. It is more memory efficient than providing the entire list at initialization time. Only the default hashing method is supported; see the hashing section for more details. This method automatically and safely deduplicates the input.

WARNING: Currently, the incremental API does not use dirty NIFs for large input sizes. Be cautious of this, because initialization can block.

Filter0 = xor8:new_empty(),            %% new_empty/0 defaults to 64 elements.  Either function
                                       %% will dynamically allocate more space as 
                                       %% needed while elements are added.
Filter1 = xor8:add(Filter0, [1, 2]),
Filter2 = xor8:add(Filter1, [3, 4]),   %% More space allocated here.
Filter3 = xor8:finalize(Filter2),      %% finalize/1 MUST be called to actually initialize the filter.
true    = xor8:contain(Filter3, 1),
false   = xor8:contain(Filter3, 6).

Do not modify the value returned by any of these functions; otherwise the other APIs will not work correctly.
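The incremental calls above can be folded over a sequence of batches. The following is a minimal sketch, not part of the library: the module name `xor8_batches_sketch` and both of its functions are hypothetical, and only `xor8:new_empty/0`, `xor8:add/2` and `xor8:finalize/1` come from this README. The `chunk/2` helper merely stands in for a real batched source (a file reader, a database cursor); chunking a list that is already fully in memory would not save anything.

```erlang
-module(xor8_batches_sketch).
-export([chunk/2, from_batches/1]).

%% Split a list into consecutive batches of at most Size elements.
%% Hypothetical helper, used here only to produce batches for the demo.
chunk([], _Size) ->
    [];
chunk(List, Size) when is_integer(Size), Size > 0 ->
    {Batch, Rest} = take(Size, List, []),
    [Batch | chunk(Rest, Size)].

%% Take up to N elements from the front of a list.
take(0, Rest, Acc) -> {lists:reverse(Acc), Rest};
take(_N, [], Acc) -> {lists:reverse(Acc), []};
take(N, [H | T], Acc) -> take(N - 1, T, [H | Acc]).

%% Build a filter by adding one batch at a time, then finalize it.
%% Requires the exor_filter dependency to be available at runtime.
from_batches(Batches) ->
    Filter = lists:foldl(
        fun(Batch, Acc) -> xor8:add(Acc, Batch) end,
        xor8:new_empty(),
        Batches
    ),
    xor8:finalize(Filter).
```

With the dependency installed, `xor8_batches_sketch:from_batches(xor8_batches_sketch:chunk(Items, 1000))` would return a finalized filter that can be passed to `xor8:contain/2`. Keep the warning above in mind: each step may block for large inputs.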

Hashing

The function xor8:new/1 uses the default hash algorithm.
- See erlang:phash2/1.
To specify the hashing algorithm to use, use the xor8:new/2 function.
The values returned by the filter initialization functions carry the hashing context, so there is no need to specify it in the xor8:contain/2 function.
- Do not pre-hash the value being passed to xor8:contain/2 or /3. Pass the raw value!
- (Unless you've explicitly set that you're using pre-hashed data. See below).
The default hashing mechanisms remove duplicate keys. Pre-hashed data should be checked by the user. The library will return an error on initialization if duplicates are detected.

Hashing Example

Filter = xor8:new([1, 2, 3], none),
true   = xor8:contain(Filter, 1),
false  = xor8:contain(Filter, 6).

Hashing API

The default hash function is erlang:phash2/1.
- It can be selected explicitly by passing default_hash as the second argument to xor8:new/2.
- It uses 60 bits on a 64-bit system and is consistent across nodes.
- The default hashing function should be fine for most use cases, but if the filter has over 20K elements, create your own hashing function, as hash collisions will become more frequent.
  - A collision does not cause an error.
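To judge whether collisions matter for a particular data set, the keys can be hashed up front and the collisions counted. This is a rough diagnostic sketch only: the module `phash2_collisions_sketch` is hypothetical, and it assumes the library applies `erlang:phash2/1` to each key unchanged, which this README implies but does not spell out.

```erlang
-module(phash2_collisions_sketch).
-export([count/1]).

%% Count how many distinct keys lose their own hash value because
%% another distinct key already maps to the same erlang:phash2/1 result.
count(Items) ->
    Unique = lists:usort(Items),
    Hashes = lists:usort([erlang:phash2(Item) || Item <- Unique]),
    length(Unique) - length(Hashes).
```

A result of 0 means every distinct key hashed to a distinct value. A non-zero result on realistic data is a hint that a custom hashing function is worth considering.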

Pre-Hashing and Custom Hashing

A hash function can be passed during initialization.
It must return an unsigned 64-bit number and have an arity of 1.
Because the Erlang NIF API cannot pass a function into a NIF and call it there, this method creates a second list of equal length. Be wary of that.
The custom hashing function must return unique keys.
- An error will be returned otherwise.
- Make your unit testing reflect reality, if possible. This will catch the issue early.

Fun    = fun(X) -> X + 1 end,
Filter = xor8:new([1, 2, 3], Fun),
true   = xor8:contain(Filter, 4),
false  = xor8:contain(Filter, 1).
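One way to obtain a function of arity 1 that returns an unsigned 64-bit number is to truncate a cryptographic digest. The sketch below is an illustration and not something the library ships: the module `hash64_sketch` is hypothetical, and the choice of SHA-256 over `term_to_binary/1` is an assumption made for the example.

```erlang
-module(hash64_sketch).
-export([hash64/1]).

%% Hash any term to an unsigned 64-bit integer by taking the first
%% 8 bytes of its SHA-256 digest. Uses the crypto application from OTP.
hash64(Term) ->
    Digest = crypto:hash(sha256, term_to_binary(Term)),
    <<Hash:64/unsigned-integer, _Rest/binary>> = Digest,
    Hash.
```

The function can then be passed as the second argument, for example `xor8:new(Items, fun hash64_sketch:hash64/1)`. Uniqueness of the resulting keys is very likely but not guaranteed, so the library's duplicate check still applies. The encoding produced by `term_to_binary/1` can vary between OTP releases, so hash values should not be assumed to be portable across nodes running different versions.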
To pass pre-hashed data, use the hash option none. In this case the xor8:contain/2 and /3 functions must also be passed pre-hashed data.
- This option also checks for duplicate hashed values and returns an error if any are detected.
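Since duplicate pre-hashed values make initialization fail, it can help to find them before building the filter. The following is a small sketch with a hypothetical module name, `prehash_check_sketch`; it does not call the library and only inspects a list of already computed hash values.

```erlang
-module(prehash_check_sketch).
-export([duplicates/1]).

%% Return, in ascending order, each hash value that occurs more than once.
duplicates(Hashes) ->
    dups(lists:sort(Hashes), []).

%% Walk the sorted list; two equal neighbours mark a duplicate.
dups([H, H | T], Acc) ->
    Rest = lists:dropwhile(fun(X) -> X =:= H end, T),
    dups(Rest, [H | Acc]);
dups([_ | T], Acc) ->
    dups(T, Acc);
dups([], Acc) ->
    lists:reverse(Acc).
```

If `prehash_check_sketch:duplicates(Hashes)` returns `[]`, the list can be handed to `xor8:new(Hashes, none)`. Otherwise the returned values show which inputs need to be deduplicated or rehashed first.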

Elixir Example

# ...
alias :xor8, as: Xor8
# ...
true =
   [1, 2, 3, 4]
   |> Xor8.new()
   |> Xor8.contain(1)

Custom Return Values

contain/3 can return a custom value instead of false if the value isn't present in the filter:

Filter1            = xor8:new(["Ricky Bobby", "Cal Naughton Jr."]),
true               = xor8:contain(Filter1, "Ricky Bobby", {error, not_found}),
{error, not_found} = xor8:contain(Filter1, "Reese Bobby", {error, not_found}).

Serialization

Functions are provided to return the filter in binary form instead of as a NIF reference. This can be useful for interoperating with other platforms and systems. The value returned by to_bin can be passed directly to contain for ease of use. Example usage:

Filter                        = xor8:new(["test1", "test2", "test3"]),
BinFilter                     = xor8:to_bin(Filter),
{XorFilterBin, _HashFunction} = BinFilter,
true                          = xor8:contain(BinFilter, "test1").

xor16

The xor16 module is used the same way. Its structure is larger, but it has a lower false positive rate. Substitute xor16 for xor8 in all of the examples.

Buffered Initialization

Buffered versions of the initialization functions are provided for larger data sets and can be faster. See xor8:new_buffered/2 for more information.

You didn't hear it from me, though ;)

Build

$ rebar3 compile

Test

$ rebar3 eunit
$ rebar3 cover

Docs

$ rebar3 edoc

Implementations of xor filters in other languages

Go
Rust: 1 and 2
C++
Java
C
