I want a basic setup: fire up a Spark cluster through the EMR console and run a Spark script that depends on a Python package (for example, arrow). What's the most straightforward way of doing this?
The most straightforward way is to create a bash script containing the installation commands, copy it to S3, and set a bootstrap action from the console that points to your script.
Here's an example I'm using in production:
s3://mybucket/bootstrap/install_python_modules.sh

#!/bin/bash -xe

# Non-standard and non-Amazon Machine Image Python modules:
sudo pip install -U \
  awscli \
  boto \
  ciso8601 \
  ujson \
  workalendar
sudo yum install -y python-psycopg2
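To wire this up from the command line instead of the console, you can upload the script to S3 and reference it as a bootstrap action when launching the cluster. A minimal sketch, assuming a bucket named mybucket and placeholder cluster settings (release label, instance type and count, and key name are all examples you'd replace with your own):

aws s3 cp install_python_modules.sh s3://mybucket/bootstrap/install_python_modules.sh

aws emr create-cluster \
  --name "Spark cluster with Python deps" \
  --release-label emr-5.36.0 \
  --applications Name=Spark \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --ec2-attributes KeyName=my-key-pair \
  --use-default-roles \
  --bootstrap-actions Path=s3://mybucket/bootstrap/install_python_modules.sh,Name="Install Python modules"

The bootstrap action runs on every node (master and core) before applications start, which is why installing via a script like this makes the package available to executors cluster-wide.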