Data Loading
Load large amounts of data into cache.
Data loading usually means initializing cache data on startup. The standard cache Put(...) and PutAll(...) operations are generally inefficient for loading large amounts of data.
IDataStreamer
Data streamers are defined by the IDataStreamer API and are built to inject large amounts of continuous data into Ignite caches. They are scalable and fault-tolerant, and they achieve high performance by batching entries together before sending them to the corresponding cluster members.
Data streamers should be used to load large amounts of data into caches at any time, including pre-loading on startup.
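As a minimal sketch of the streamer workflow, the example below streams generated entries into a cache. It assumes the Apache.Ignite NuGet package is referenced; the class name StreamerExample, the method StreamEntries, and the generated values are hypothetical and chosen only for illustration. Depending on the Ignite.NET release, the method for adding an entry may be named Add rather than AddData, so check the API reference for your version.

```csharp
using System;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Datastream;

// Hypothetical helper, not part of Ignite: shows one way to use a data streamer.
public static class StreamerExample
{
    // Streams `count` generated entries into the named cache
    // and returns the cache size afterwards.
    public static int StreamEntries(IIgnite ignite, string cacheName, int count)
    {
        // The target cache must exist before a streamer is obtained for it.
        var cache = ignite.GetOrCreateCache<int, string>(cacheName);

        // The streamer buffers entries and sends them in batches.
        // Disposing it flushes whatever is still buffered.
        using (IDataStreamer<int, string> streamer =
            ignite.GetDataStreamer<int, string>(cacheName))
        {
            for (int i = 0; i < count; i++)
            {
                streamer.AddData(i, "value-" + i);
            }
        }

        return cache.GetSize();
    }
}
```

A caller would typically start or obtain a node with Ignition.Start() and then invoke StreamerExample.StreamEntries(ignite, "myCache", 100000). Leaving the using block is what guarantees that the last partial batch reaches the cluster.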
See the Data Streamers documentation for more information.
ICache.LoadCache()
Another way to load large amounts of data into a cache is through the ICacheStore.LoadCache() method, which can load cache data without being given all the keys that need to be loaded.
The ICache.LoadCache() method delegates to ICacheStore.LoadCache() on every cluster member that is running the cache. To invoke loading only on the local cluster node, use the ICache.LocalLoadCache() method.
For partitioned caches, keys that are not mapped to a given node, either as primary or as backup, are automatically discarded by the cache on that node.
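The delegation described above can be sketched as follows. This is an illustration under assumptions, not a reference implementation: it assumes an Ignite.NET release in which the cache store API is generic (CacheStoreAdapter<TK, TV>), and the names InMemoryPersonStore, InMemoryPersonStoreFactory, LoadCacheExample, and the cache name "persons" are all hypothetical. The store generates rows in memory where a real store would query a database.

```csharp
using System;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Cache.Configuration;
using Apache.Ignite.Core.Cache.Store;
using Apache.Ignite.Core.Common;

// Hypothetical store: generates rows in memory instead of reading a database.
public class InMemoryPersonStore : CacheStoreAdapter<int, string>
{
    // Called on each node by ICache.LoadCache(); every (key, value)
    // passed to `act` is offered to the cache on that node.
    public override void LoadCache(Action<int, string> act, params object[] args)
    {
        // args are the extra arguments given to ICache.LoadCache().
        int count = args != null && args.Length > 0 ? Convert.ToInt32(args[0]) : 100;

        for (int key = 0; key < count; key++)
        {
            act(key, "person-" + key);
        }
    }

    public override string Load(int key) { return "person-" + key; }

    public override void Write(int key, string val) { /* no-op in this sketch */ }

    public override void Delete(int key) { /* no-op in this sketch */ }
}

// Factory that Ignite uses to create the store on each node.
[Serializable]
public class InMemoryPersonStoreFactory : IFactory<ICacheStore>
{
    public ICacheStore CreateInstance() { return new InMemoryPersonStore(); }
}

public static class LoadCacheExample
{
    // Creates a store-backed cache, loads `count` rows, returns the cache size.
    public static int Load(IIgnite ignite, int count)
    {
        var cfg = new CacheConfiguration("persons")
        {
            CacheStoreFactory = new InMemoryPersonStoreFactory()
        };

        var cache = ignite.GetOrCreateCache<int, string>(cfg);

        // A null filter accepts every entry; `count` arrives in the store as args[0].
        cache.LoadCache(null, count);

        return cache.GetSize();
    }
}
```

Replacing cache.LoadCache(null, count) with cache.LocalLoadCache(null, count) would run the store only on the node making the call.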
Partition-aware data loading
In the scenario described above, the same query is executed on all the nodes. Each node iterates over the whole result set and skips the keys that do not belong to it, which is not very efficient.
The situation can be improved if the partition ID is stored alongside each record in the database. You can use the ICacheAffinity interface to get the partition ID for any key being stored in a cache.
Once the stored records carry their partition ID, each node can query only the partitions that belong to it. To do that, inject an instance of Ignite into your cache store and use it to determine which partitions belong to the local node.
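A partition-aware store along these lines might look like the sketch below. It makes several assumptions: a generic cache store API, an in-memory dictionary standing in for a database table with a partition ID column, and hypothetical names (PartitionAwareStore, PartitionAwareStoreFactory, a fixed set of 1000 generated rows). In a real store the partition ID would have been written to the database when each record was saved, and the lookup per partition would be a query filtered on that column.

```csharp
using System;
using System.Collections.Generic;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Cache.Store;
using Apache.Ignite.Core.Common;
using Apache.Ignite.Core.Resource;

// Hypothetical store that loads only the partitions owned by the local node.
public class PartitionAwareStore : CacheStoreAdapter<int, string>
{
    private readonly string _cacheName;

    // Set by Ignite through resource injection when the store is initialized.
    [InstanceResource]
    private IIgnite _ignite = null;

    public PartitionAwareStore(string cacheName) { _cacheName = cacheName; }

    public override void LoadCache(Action<int, string> act, params object[] args)
    {
        var affinity = _ignite.GetAffinity(_cacheName);

        // Stand-in for a table with a partition ID column: rows grouped by the
        // partition computed for each key. A real store would persist this ID
        // when the record is written instead of recomputing it here.
        var table = new Dictionary<int, List<KeyValuePair<int, string>>>();
        for (int key = 0; key < 1000; key++)
        {
            int partition = affinity.GetPartition(key);
            List<KeyValuePair<int, string>> rows;
            if (!table.TryGetValue(partition, out rows))
            {
                rows = new List<KeyValuePair<int, string>>();
                table[partition] = rows;
            }
            rows.Add(new KeyValuePair<int, string>(key, "row-" + key));
        }

        // Primary and backup partitions assigned to this node.
        int[] localPartitions =
            affinity.GetAllPartitions(_ignite.GetCluster().GetLocalNode());

        // Equivalent of running "WHERE partition_id = ?" once per local partition.
        foreach (int partition in localPartitions)
        {
            List<KeyValuePair<int, string>> rows;
            if (!table.TryGetValue(partition, out rows)) continue;

            foreach (KeyValuePair<int, string> row in rows)
            {
                act(row.Key, row.Value);
            }
        }
    }

    public override string Load(int key) { return "row-" + key; }

    public override void Write(int key, string val) { /* no-op in this sketch */ }

    public override void Delete(int key) { /* no-op in this sketch */ }
}

// Passes the cache name to the store, since the store needs it to look up affinity.
[Serializable]
public class PartitionAwareStoreFactory : IFactory<ICacheStore>
{
    private readonly string _cacheName;

    public PartitionAwareStoreFactory(string cacheName) { _cacheName = cacheName; }

    public ICacheStore CreateInstance() { return new PartitionAwareStore(_cacheName); }
}
```

The store is wired in through CacheConfiguration.CacheStoreFactory and triggered with ICache.LoadCache(null). On a single-node cluster every partition is local, so all rows are loaded; with more nodes, each node hands the cache only the rows from its own partitions.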
