Title:  'Rcpp' Bindings for 'Annoy', a Library for Approximate Nearest Neighbors 

Description:  'Annoy' is a small C++ library for Approximate Nearest Neighbors written for efficient memory usage as well as an ability to load from / save to disk. This package provides an R interface by relying on the 'Rcpp' package, exposing the same interface as the original Python wrapper to 'Annoy'. See <https://github.com/spotify/annoy> for more on 'Annoy'. 'Annoy' is released under Version 2.0 of the Apache License. Also included is a small Windows port of 'mmap' which is released under the MIT license.
Authors:  Dirk Eddelbuettel 
Maintainer:  Dirk Eddelbuettel <[email protected]> 
License:  GPL (>= 2) 
Version:  0.0.22 
Built:  2024-06-20 05:15:41 UTC
Source:  https://github.com/eddelbuettel/rcppannoy 
Annoy is a small library written to provide fast and memory-efficient nearest neighbor lookup from a possibly static index which can be shared across processes.
Details about Annoy are available at the reference listed below.
Dirk Eddelbuettel for the R interface; Erik Bernhardsson for Annoy itself.
https://github.com/spotify/annoy
a <- new(AnnoyEuclidean, vectorsz)
a$setSeed(0)
a$setVerbose(0)
a$addItem(i, dv)
a$getNItems()
a$getItemsVector(i)
a$getDistance(i, j)
a$build(n_trees)
a$getNNsByItem(i, n)
a$getNNsByItemList(i, n, search_k, include_distances)
a$getNNsByVector(v, n)
a$getNNsByVectorList(v, n, search_k, include_distances)
a$save(fn)
a$load(fn)
a$unload()
new(Class, vectorsz)
Create a new Annoy instance of type Class, where Class is one of the following: AnnoyEuclidean, AnnoyAngular, AnnoyManhattan, AnnoyHamming. vectorsz denotes the length of the vectors that the Annoy instance will be indexing.
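As a rough illustration of how the choice of Class changes the metric, the sketch below stores the same two vectors in three index types and compares the result of $getDistance. The vectors p and q and all variable names are illustrative assumptions, not part of the package.

```r
library(RcppAnnoy)

# Two illustrative 2-dimensional vectors (assumed values)
p <- c(1, 0)
q <- c(0, 2)

# Euclidean index: vectorsz must match the vector length
e <- new(AnnoyEuclidean, 2)
e$addItem(0, p)
e$addItem(1, q)
d_euclidean <- e$getDistance(0, 1)

# Manhattan index over the same vectors
m <- new(AnnoyManhattan, 2)
m$addItem(0, p)
m$addItem(1, q)
d_manhattan <- m$getDistance(0, 1)

# Angular index over the same vectors
g <- new(AnnoyAngular, 2)
g$addItem(0, p)
g$addItem(1, q)
d_angular <- g$getDistance(0, 1)

print(c(euclidean = d_euclidean, manhattan = d_manhattan, angular = d_angular))
```

The three values differ because each Class measures distance in its own way, so the Class should be chosen to match how similarity is defined for the data.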
$addItem(i, v)
Adds item i (any non-negative integer) with vector v. Note that it will allocate memory for max(i) + 1 items.
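The allocation note matters when item ids are sparse. A minimal sketch, assuming a single item is added at id 9 (the id and vector are arbitrary choices for illustration):

```r
library(RcppAnnoy)

idx <- new(AnnoyEuclidean, 3)

# Only one vector is added, but at id 9 rather than id 0
idx$addItem(9, c(1, 2, 3))

# The item count follows the largest id seen, not the number of addItem calls
n_items <- idx$getNItems()
print(n_items)
```

Keeping ids contiguous from 0 avoids reserving space for ids that are never used.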
$build(n_trees)
Builds a forest of n_trees trees. More trees give higher precision when querying. After calling build, no more items can be added.
$save(fn)
Saves the index to disk as filename fn. After saving, no more items can be added.
$load(fn)
Loads (mmaps) an index from filename fn on disk.
$unload()
Unloads index.
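A typical use of $save and $load is to build an index once and query it from a different instance. The sketch below is one way to do that, assuming a small random data set; the file name, sizes and variable names are illustrative.

```r
library(RcppAnnoy)

set.seed(1)
vector_size <- 4

# Build a small index from 20 random vectors (ids 0..19)
a <- new(AnnoyEuclidean, vector_size)
a$setSeed(42)
for (i in 0:19) a$addItem(i, runif(vector_size))
a$build(10)

index_file <- tempfile(pattern = "annoy", fileext = ".tree")
a$save(index_file)
before <- a$getNNsByItem(0, 5)

# A fresh instance is created with the same class and vector length,
# then backed by the file written above
b <- new(AnnoyEuclidean, vector_size)
b$load(index_file)
after <- b$getNNsByItem(0, 5)

print(identical(before, after))

# Release both instances before deleting the file
a$unload()
b$unload()
unlink(index_file)
```

Because the index is memory-mapped on load, several processes can point at the same file instead of each rebuilding the forest.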
$getDistance(i, j)
Returns the distance between items i and j.
$getNNsByItem(i, n)
Returns the n closest items to item i as an integer vector of indices.
$getNNsByVector(v, n)
Same as $getNNsByItem, but queries by vector v rather than index i.
$getNNsByItemList(i, n, search_k = -1, include_distances = FALSE)
Returns the n closest items to item i as a list. During the query it will inspect up to search_k nodes, which defaults to n_trees * n if not provided. search_k gives you a runtime tradeoff between better accuracy and speed. If you set include_distances to TRUE, it will return a length 2 list with elements "item" & "distance". The "item" element contains the n closest items as an integer vector of indices. The optional "distance" element contains the corresponding distances to "item" as a numeric vector.
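To make the list form and the search_k tradeoff concrete, here is a minimal sketch on random data. The data set size, the number of trees and the explicit search_k value of 2000 are arbitrary assumptions chosen for illustration.

```r
library(RcppAnnoy)

set.seed(1)
vector_size <- 8
n_trees <- 20

a <- new(AnnoyEuclidean, vector_size)
a$setSeed(42)
for (i in 0:199) a$addItem(i, runif(vector_size))
a$build(n_trees)

# search_k = -1 requests the default budget of n_trees * n nodes
res_default <- a$getNNsByItemList(0, 5, -1, TRUE)

# A larger explicit budget inspects more nodes: slower, usually more accurate
res_wide <- a$getNNsByItemList(0, 5, 2000, TRUE)

# Both results are lists with an "item" and a "distance" element
str(res_default)
str(res_wide)
```

Comparing the two results on real data is a simple way to judge whether the default budget is sufficient for a given accuracy target.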
$getNNsByVectorList(v, n, search_k = -1, include_distances = FALSE)
Same as $getNNsByItemList, but queries by vector v rather than index i.
$getItemsVector(i)
Returns the vector for item i that was previously added.
$getNItems()
Returns the number of items in the index.
$setVerbose()
If 1, messages will be printed during processing. If 0, messages will be suppressed during processing.
$setSeed()
Set random seed for annoy (integer).
library(RcppAnnoy)

# BUILDING ANNOY INDEX
vector_size <- 10
a <- new(AnnoyEuclidean, vector_size)

a$setSeed(42)

# Turn on verbose status messages (0 to turn off)
a$setVerbose(1)

# Load 100 random vectors into index
for (i in 1:100) a$addItem(i - 1, runif(vector_size)) # Annoy uses zero indexing

# Display number of items in index
a$getNItems()

# Retrieve item at position 0 in index
a$getItemsVector(0)

# Calculate distance between items at positions 0 & 1 in index
a$getDistance(0, 1)

# Build forest with 50 trees
a$build(50)

# PERFORMING ANNOY SEARCH

# Retrieve 5 nearest neighbors to item 0
# Returned as integer vector of indices
a$getNNsByItem(0, 5)

# Retrieve 5 nearest neighbors to item 0
# search_k = -1 will invoke default search_k value of n_trees * n
# Return results as list with an element for distance
a$getNNsByItemList(0, 5, -1, TRUE)

# Retrieve 5 nearest neighbors to item 0
# search_k = -1 will invoke default search_k value of n_trees * n
# Return results as list without an element for distance
a$getNNsByItemList(0, 5, -1, FALSE)

v <- runif(vector_size)
# Retrieve 5 nearest neighbors to vector v
# Returned as integer vector of indices
a$getNNsByVector(v, 5)

# Retrieve 5 nearest neighbors to vector v
# search_k = -1 will invoke default search_k value of n_trees * n
# Return results as list with an element for distance
a$getNNsByVectorList(v, 5, -1, TRUE)

# Retrieve 5 nearest neighbors to vector v
# search_k = -1 will invoke default search_k value of n_trees * n
# Return results as list without an element for distance
a$getNNsByVectorList(v, 5, -1, FALSE)

# SAVING/LOADING ANNOY INDEX

# Create a tempfile, replace with a local file to keep
treefile <- tempfile(pattern="annoy", fileext="tree")

# Save annoy tree to disk
a$save(treefile)

# Load annoy tree from disk
a$load(treefile)

# Unload index from memory
a$unload()
Get the version of the Annoy C++ library that RcppAnnoy was compiled with.
getAnnoyVersion(compact = FALSE)
compact
Logical scalar indicating whether a compact package_version object should be returned.

An integer vector containing the major, minor and patch version numbers; or if compact=TRUE, a package_version object.
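A short sketch of both return forms follows. The version threshold "1.12.0" used in the comparison is an arbitrary value chosen for illustration, not a documented requirement.

```r
library(RcppAnnoy)

# Default form: integer vector of major, minor and patch numbers
v <- getAnnoyVersion()
print(v)

# Compact form: a package_version object, convenient for comparisons
v_compact <- getAnnoyVersion(compact = TRUE)
print(v_compact)

# package_version objects can be compared against version strings
# ("1.12.0" is a hypothetical threshold)
if (v_compact >= "1.12.0") {
  message("Annoy is at least version 1.12.0")
}
```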
Aaron Lun
Report CPU Architecture and Compiler
getArchictectureStatus()
A constant created at compile time describing the extent of AVX instructions (512 bit, 128 bit, or none) and the compiler in use; the compilers currently recognised are MSC (unlikely for R), GCC, Clang, or 'other'.