I am getting the following error in cqlsh. The COPY command runs for a few seconds and then stops.
Looking forward to your help.
Thanks.
Connected to drm at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.1.8 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> USE myworld;
cqlsh:myworld> COPY citizens (id, first_name, last_name, house_no, street, city, country, ssn, phone, bank_name, account_no) FROM '/home/rashmi/documents/mydata/road/peopledata-18-jun-1.txt';
Processed 110000 rows; Write: 47913.28 rows/s
Connection heartbeat failure
Aborting import at record #1196. Previously inserted records are still present, and some records after that point may be present as well.

I have a 3-node setup: 192.168.1.10, .11 and .12, with .11 being the seed.
The schema:

CREATE KEYSPACE myworld
  WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

CREATE COLUMNFAMILY citizens (
  id uuid,
  first_name varchar,
  last_name varchar,
  house_no varchar,
  street varchar,
  city varchar,
  country varchar,
  ssn varchar,
  phone varchar,
  bank_name varchar,
  account_no varchar,
  PRIMARY KEY ((country, city), ssn)
);

The following is from cassandra.yaml:
cluster_name: 'drm'
# initial_token: 0
seeds: "192.168.1.11"
listen_address: 192.168.1.11
endpoint_snitch: GossipingPropertyFileSnitch
An update to my own question, in case it helps anyone.
Environment
My setup is based on Cassandra 2.2 and Ubuntu 14 on 3 laptops:
- i7-4700MQ / 16 GB / 1 TB drive
- i7-4710MQ / 16 GB / 1 TB drive
- i7-670 / 4 GB / 500 GB drive (old machine)
Keyspace replication factor of 3. Java heap of 8 GB on the first 2 machines; the third has a max heap of 400 MB.
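For anyone reproducing this setup: Cassandra's heap size is normally set in conf/cassandra-env.sh. A minimal sketch matching the sizes above (the variable names are Cassandra's standard ones; the HEAP_NEWSIZE values are illustrative, not something I tuned here):

```sh
# conf/cassandra-env.sh on the two 16 GB machines
MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="800M"    # a common rule of thumb is ~100 MB per physical core

# conf/cassandra-env.sh on the old 4 GB machine
# MAX_HEAP_SIZE="400M"
# HEAP_NEWSIZE="100M"
```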
Initially, the nodes were on a wireless network via an internet router.
Objective
Import multiple 70 GB CSV files containing 330+ million dummy financial transactions.
Issue
Connection heartbeat failure partway through the import; sometimes after importing a few million rows, sometimes only after 230 million.
Findings
- With wireless, pings to the router and the other nodes were in the tune of 200+ ms. I connected the nodes with Cat 5e and Cat 6 cables, which reduced pings to under 0.3 ms.
- I stopped performing additional heavy disk-oriented tasks in the meanwhile, such as copying 70+ GB files, running heavy cqlsh SELECT queries, and querying disk space across 10k data files.
- Data ingestion was regulated to 9k rows per second to limit disk usage.
- The third node had disk issues and went down in between, leaving a large number of hints.
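On newer cqlsh versions (the rewritten COPY landed around Cassandra 2.1.13/2.2), rate regulation can be expressed directly as COPY options. A sketch, assuming such a version; this is not necessarily the exact mechanism I used:

```sql
-- Throttle COPY FROM to roughly 9,000 rows/s and insert in smaller chunks.
COPY citizens (id, first_name, last_name, house_no, street, city, country,
               ssn, phone, bank_name, account_no)
FROM '/home/rashmi/documents/mydata/road/peopledata-18-jun-1.txt'
WITH INGESTRATE = 9000 AND CHUNKSIZE = 1000;
```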
Present
I import 700+ million rows each day, using 1 machine at a time. A second simultaneous import process brings back the heartbeat error.
Next
Looking for ways to improve ingestion to twice the current rate without hardware changes.
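One direction for raising throughput without new hardware is parallelism: split each large CSV into pieces and run one import process per piece, each pointed at a different node. A minimal, hypothetical splitting sketch in Python (the function name, prefix, and chunk size are illustrative, not part of my actual setup):

```python
import itertools

def split_csv(path, rows_per_part, part_prefix="part"):
    """Split a large CSV into files of at most rows_per_part lines each,
    so several import processes can work on different pieces in parallel."""
    parts = []
    with open(path, "r") as src:
        for i in itertools.count():
            # islice pulls the next rows_per_part lines without
            # loading the whole file into memory.
            chunk = list(itertools.islice(src, rows_per_part))
            if not chunk:
                break
            part_name = "%s-%04d.csv" % (part_prefix, i)
            with open(part_name, "w") as dst:
                dst.writelines(chunk)
            parts.append(part_name)
    return parts
```

Each resulting piece can then be fed to a separate COPY (or bulk-loader) session; whether two sessions avoid the heartbeat error likely depends on keeping their combined rate within what the disks sustained above.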
Thanks.