Ubuntu 14.04 - Cassandra COPY command - Connection heartbeat failure


I am getting the following error in cqlsh. The COPY command runs for a few seconds, then stops.

Looking forward to your help.

Thanks,

    Connected to drm at 127.0.0.1:9042.
    [cqlsh 5.0.1 | Cassandra 2.1.8 | CQL spec 3.2.0 | Native protocol v3]
    Use HELP for help.
    cqlsh> USE myworld;
    cqlsh:myworld> COPY citizens (id, first_name, last_name, house_no, street, city, country, ssn, phone, bank_name, account_no) FROM '/home/rashmi/documents/mydata/road/peopledata-18-jun-1.txt';
    Processed 110000 rows; Write: 47913.28 rows/s
    Connection heartbeat failure
    Aborting import at record #1196. Previously inserted records are still present, and some records after that may be present as well.
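If upgrading cqlsh is an option, newer 2.1.x/2.2.x releases expose COPY FROM throttling options that can keep the worker processes from overwhelming the coordinator and tripping the heartbeat check. A sketch, assuming one of those versions (the option values here are illustrative, not tuned):

```
cqlsh:myworld> COPY citizens (id, first_name, last_name, house_no, street, city, country, ssn, phone, bank_name, account_no)
               FROM '/home/rashmi/documents/mydata/road/peopledata-18-jun-1.txt'
               WITH INGESTRATE = 50000   -- cap on rows/s fed to the workers
                AND NUMPROCESSES = 4     -- number of worker processes
                AND MAXBATCHSIZE = 20    -- rows per insert batch
                AND MAXATTEMPTS = 10;    -- retries per failed chunk
```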

I have a 3-node setup: 192.168.1.10, .11 and .12, with .11 being the seed.

    CREATE KEYSPACE myworld WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

    CREATE COLUMNFAMILY citizens (
        id uuid,
        first_name varchar,
        last_name varchar,
        house_no varchar,
        street varchar,
        city varchar,
        country varchar,
        ssn varchar,
        phone varchar,
        bank_name varchar,
        account_no varchar,
        PRIMARY KEY ((country, city), ssn)
    );
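Since the partition key is the composite (country, city), efficient reads must supply both of those columns; ssn then orders rows within the partition. A minimal illustration (all values made up):

```
INSERT INTO citizens (id, first_name, last_name, house_no, street, city,
                      country, ssn, phone, bank_name, account_no)
VALUES (uuid(), 'Jane', 'Doe', '12', 'Main St', 'Pune',
        'India', '123-45-6789', '555-0100', 'ExampleBank', 'A-001');

-- Hits a single partition, rows ordered by the ssn clustering column:
SELECT * FROM citizens WHERE country = 'India' AND city = 'Pune';
```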

The following is from cassandra.yaml:

    cluster_name: 'drm'
    # initial_token: 0
    seeds: "192.168.1.11"
    listen_address: 192.168.1.11
    endpoint_snitch: GossipingPropertyFileSnitch

Some updates to my own question, in case it helps anyone.

Environment

My setup is based on Cassandra 2.2 and Ubuntu 14 on 3 laptops:

  1. i7 MQ 4700 / 16 GB / 1 TB drive
  2. i7 MQ 4710 / 16 GB / 1 TB drive
  3. i7 670 / 4 GB / 500 GB drive (old machine)

Keyspace replication factor of 3. Java heap of 8 GB on the first 2 machines; the third has a max heap of 400 MB.

I was using a wireless network via an internet router.

Objective

Import multiple 70 GB CSV files containing 330+ million dummy financial transactions.

Issue

Heartbeat failure in between: sometimes after importing a few million rows, sometimes after 230 million.

Findings

  1. With wireless, pings to the router and the other nodes were in the order of 200+ ms. I connected the nodes with Cat 5e and Cat 6 cables instead, which reduced the ping to < 0.3 ms.

  2. I stopped performing additional heavy disk-oriented tasks in the meanwhile, such as copying 70+ GB files, running heavy cqlsh SELECT commands, and querying disk space across the ~10k data files.

Data ingestion was regulated to 9k rows per second, keeping disk usage in check.
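The regulation above can be sketched as a simple throttle that sleeps between batches; a minimal sketch in Python (the name rate_limited and the defaults are mine, not from cqlsh):

```python
import time

def rate_limited(rows, rows_per_sec=9000, batch=1000):
    """Yield rows, sleeping after each batch so the average
    throughput stays at or below rows_per_sec."""
    start = time.monotonic()
    sent = 0
    for row in rows:
        yield row
        sent += 1
        if sent % batch == 0:
            # Seconds this many rows *should* have taken at the target rate.
            target = sent / rows_per_sec
            elapsed = time.monotonic() - start
            if elapsed < target:
                time.sleep(target - elapsed)
```

Each yielded row would then be bound into a prepared INSERT; the throttle keeps the coordinator and disks from being saturated.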

  3. The third node had disk issues and went down in between, leaving a large number of hints.

Present

I import 700+ million rows each day, using 1 machine at a time. A second simultaneous import process brings back the heartbeat error.

Next

Looking for ways to improve ingestion to twice the current rate without hardware changes.
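One avenue might be to run imports in parallel against different coordinator nodes rather than a second process against the same node (which triggers the heartbeat error). That means splitting each input file so every part can be fed to a COPY process pointed at a different node. A sketch, with split_csv being a hypothetical helper of my own:

```python
import csv
import os

def split_csv(src_path, n_parts, dest_dir):
    """Split a CSV row-wise into n_parts files so each part can be
    imported by a separate COPY process against a different node."""
    paths = [os.path.join(dest_dir, "part-%d.csv" % i) for i in range(n_parts)]
    outs = [open(p, "w", newline="") for p in paths]
    writers = [csv.writer(f) for f in outs]
    with open(src_path, newline="") as f:
        # Deal rows out round-robin so the parts end up roughly equal.
        for i, row in enumerate(csv.reader(f)):
            writers[i % n_parts].writerow(row)
    for f in outs:
        f.close()
    return paths
```

Whether this actually doubles throughput depends on whether the disks or the coordinators are the bottleneck.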

Thanks,

