In this post, I will show you how to import some data into your cluster. There is a very cool program called CDM, written by the very talented Jon Haddad, but it was written for importing test datasets into a single-node localhost installation of Cassandra, which is not what we have 🙂
In our case, we will create the schema and import the data manually.
Start up both C* nodes and start the C* process on each. If you have followed the previous posts, all you should need to do is type “cassandra” at the command prompt. Do one node at a time.
Run “nodetool status” and you should see both nodes in your cluster.
$ nodetool status
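As a quick sanity check, the status output can be filtered for nodes in the Up/Normal state, which nodetool marks with the code UN at the start of the line. The helper below is only a sketch: count_up_normal is a name made up for this post, and it assumes each node line begins with the two-letter state code.

```shell
# Count the nodes reported as Up/Normal ("UN") in `nodetool status`
# output read from stdin. count_up_normal is a hypothetical helper name.
count_up_normal() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps the helper usable under "set -e".
    grep -c '^UN' || true
}

# Usage, on either node:
#   nodetool status | count_up_normal
# For the two-node cluster in this series the result should be 2.
```

If the count is lower than expected, give the second node a little more time to join before moving on.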
Download the necessary files to your server.
cd /home/cassandra
wget http://nosqldiaries.com/wp-content/uploads/2018/08/schema.txt
wget http://nosqldiaries.com/wp-content/uploads/2018/08/movies.csv
wget http://nosqldiaries.com/wp-content/uploads/2018/08/ratings_by_movie.csv
wget http://nosqldiaries.com/wp-content/uploads/2018/08/ratings_by_user.csv
wget http://nosqldiaries.com/wp-content/uploads/2018/08/users.csv
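Before going further, it is worth confirming that all five files actually arrived and are not empty. This is a minimal sketch rather than part of the original procedure; check_downloads is a hypothetical helper, and the directory is passed in as an argument.

```shell
# Report every expected file that is missing or empty in the given
# directory. Returns non-zero if at least one file has a problem.
# check_downloads is a hypothetical helper name.
check_downloads() {
    dir=$1
    missing=0
    for f in schema.txt movies.csv ratings_by_movie.csv ratings_by_user.csv users.csv; do
        # -s is true only when the file exists and has a size greater than zero
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $f"
            missing=1
        fi
    done
    return $missing
}

# Usage:
#   check_downloads /home/cassandra
```

No output means every file is in place; any line that is printed names a file to download again.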
Now start cqlsh, create the schema, and import the data.
3.1 Start cqlsh
3.2 Create the movielens schema
3.3 Check the schema
desc keyspace movielens ;
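The schema steps can also be run from the shell instead of inside an interactive cqlsh session, using the -f option of cqlsh to execute a file and -e to execute a single statement. The sketch below assumes that schema.txt creates the movielens keyspace itself; load_schema and the CQLSH override variable are names invented for this example.

```shell
# Load a CQL schema file with cqlsh, then describe the keyspace to confirm
# it exists. load_schema and the CQLSH variable are hypothetical; any extra
# arguments (for example a host address) are passed through to cqlsh.
load_schema() {
    schema_file=$1
    shift
    # -f runs the statements in the file, -e runs a single statement
    "${CQLSH:-cqlsh}" -f "$schema_file" "$@" &&
    "${CQLSH:-cqlsh}" -e "DESCRIBE KEYSPACE movielens" "$@"
}

# Usage:
#   load_schema /home/cassandra/schema.txt
#   load_schema /home/cassandra/schema.txt <node-address>
```

The second form is only needed when cqlsh cannot reach the node on its default address.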
3.4 Now copy the data from the csv files into our database
COPY movielens.movies FROM 'movies.csv' WITH DELIMITER=',';
COPY movielens.ratings_by_movie FROM 'ratings_by_movie.csv' WITH DELIMITER=',';
COPY movielens.ratings_by_user FROM 'ratings_by_user.csv' WITH DELIMITER=',';
COPY movielens.users FROM 'users.csv' WITH DELIMITER=',';
3.5 Quickly verify the data
select * from movielens.movies limit 10;
select * from movielens.ratings_by_movie limit 10;
select * from movielens.ratings_by_user limit 10;
select * from movielens.users limit 10;
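Beyond eyeballing ten rows per table, a rough cross-check is to count the lines in each CSV file and compare them with the row totals that COPY prints when it finishes. This is a minimal sketch, assuming the files have no header row and no line breaks inside quoted fields; csv_line_counts is a hypothetical helper name.

```shell
# Print "<file> <line count>" for each CSV file given as an argument.
# csv_line_counts is a hypothetical helper name.
csv_line_counts() {
    for f in "$@"; do
        # Reading from stdin keeps wc from echoing the file name;
        # tr strips the padding some wc implementations add.
        printf '%s %s\n' "$f" "$(wc -l < "$f" | tr -d ' ')"
    done
}

# Usage, from the shell (not from inside cqlsh):
#   cd /home/cassandra
#   csv_line_counts movies.csv ratings_by_movie.csv ratings_by_user.csv users.csv
```

A mismatch is not automatically an error, since a header row or skipped rows would also shift the numbers, but it is a good prompt to look more closely.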
Flush the data to disk.
Calling nodetool flush ensures that our memtables have been written to disk as SSTables. If we didn’t do this, our data would still be sitting in memory (backed only by the commit log), and compaction only works on SSTables that are already on disk.
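The flush only acts on the node where it is run, so it needs to be repeated on each node. The sketch below flushes one keyspace and then filters nodetool tablestats for the per-table SSTable counts. flush_and_report and the NODETOOL override variable are names invented here, and the grep pattern assumes the "Table:" and "SSTable count:" labels, which may differ between Cassandra versions.

```shell
# Flush one keyspace on the local node, then show how many SSTables each
# of its tables has. flush_and_report and NODETOOL are hypothetical names;
# the grep pattern assumes the tablestats labels of recent versions.
flush_and_report() {
    keyspace=$1
    "${NODETOOL:-nodetool}" flush "$keyspace" &&
    "${NODETOOL:-nodetool}" tablestats "$keyspace" | grep -E 'Table:|SSTable count:'
}

# Usage, on each node in turn:
#   flush_and_report movielens
```

After the flush, tables that received data on that node should show an SSTable count of at least 1.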
We now have a Cassandra database with test application data in it. We can now use this for testing and learning.