So, the AWS cli command to create the cluster:
myBA='--bootstrap-action Path="s3://< my-s3-bucket >/shell/install_profile.sh"'; aws emr create-cluster --release-label emr-5.0.0 --name testBA-10 --applications {Name="Hive",Name="Spark",Name="Zeppelin",Name="Ganglia"} --ec2-attributes my-Keypair --region us-east-1 --use-default-roles --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge InstanceGroupType=CORE,InstanceCount=1,InstanceType=m3.xlarge ${myBA}The environmental variable myBA is the path to the S3 bucket that stores my script for the BA. It's worth noting that the script should exit with a 0 exit value if you don't want your BA to fail. Here's my script:
#!/usr/bin/env bash
set -x
set -e
HADOOP_HOME=/home/hadoop
# My profile
aws s3 cp s3://< my S3 bucket >/AWSStuff/std/myfuncs ${HADOOP_HOME}
echo "About to modify my .bashrc login"
echo -e "\n. ~/myfuncs\ngetNewProfile\n. .hbw" >> ${HADOOP_HOME}/.bashrc
# Install byobu
sudo /usr/bin/yum -y --enablerepo=epel install byobu
# And we're ready to rock
exit 0
Bingo. We're ready to roll.
 
