
Apache Spark and TensorFlow on CentOS Stream 9 with finance-related Python packages

Apps4Rent LLC


Apache Spark, TensorFlow, and the finance-related Python packages on CentOS Stream 9 are open-source tools

This product includes Apache Spark and TensorFlow along with open-source finance libraries. Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning workloads on single-node machines or clusters. It is built on an advanced distributed SQL engine for large-scale data.

Key features of Apache Spark:

  • Batch/streaming data: Unify the processing of your data in batches and as real-time streams.
  • SQL analytics: Execute fast, distributed ANSI SQL queries for dashboarding and ad-hoc reporting, often faster than traditional data warehouses.
  • Machine learning: Train machine learning models with Spark's MLlib library on a single machine and scale the same code out to a cluster.
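The SQL analytics feature above can be sketched with the spark-sql CLI. This is a hedged example: it assumes Spark's bin/ directory is already on PATH (as configured later in this document) and simply prints the query where Spark is not installed.

```shell
# Run a distributed ANSI SQL query through the spark-sql CLI.
QUERY="SELECT count(*) AS n FROM range(1000)"
if command -v spark-sql >/dev/null 2>&1; then
  RESULT=$(spark-sql -e "$QUERY")
else
  # Fall back gracefully on machines without Spark.
  RESULT="spark-sql not on PATH; query shown only: $QUERY"
fi
echo "$RESULT"
```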

TensorFlow is an open source platform that lets you create production-grade machine learning models with pre-trained models or your own custom ones.

Features of TensorFlow:

  • Multiple APIs: Offers different APIs like Keras and eager execution for ease of use and flexibility.
  • End-to-end platform: Covers the entire machine learning workflow, from data preprocessing and model training to deployment and serving.
  • Powerful ecosystem: Has a large and active community, extensive documentation, and numerous libraries for specific tasks.
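A minimal sketch of the Keras API mentioned above: build a one-layer model and count its parameters. This is hedged — it runs only where TensorFlow is importable and prints a notice otherwise.

```shell
OUT=$(python3 - <<'PY'
try:
    import tensorflow as tf
    # Dense(1) on 4 inputs -> 4 weights + 1 bias = 5 parameters
    model = tf.keras.Sequential([tf.keras.Input(shape=(4,)),
                                 tf.keras.layers.Dense(1)])
    print("keras model parameters:", model.count_params())
except ImportError:
    print("tensorflow is not installed in this interpreter")
PY
)
echo "$OUT"
```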

To verify the Apache Spark installation, perform the steps below:

  • 1. Update the VM with sudo yum update. After updating, run sudo su and then cd /root.
  • 2. Edit the .bashrc configuration file to add the Apache Spark installation directory to the system path. Run sudo nano ~/.bashrc and add the lines below at the end of the file, then save and exit: export SPARK_HOME=/opt/spark and export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin. Apply the changes by running source ~/.bashrc.
  • 3. Run sudo systemctl start httpd.
  • 4. Run sudo firewall-cmd --reload.
  • 5. Start the standalone master server by running start-master.sh.
  • 6. To view the Spark web user interface, open a web browser and enter the public IP address of your instance on port 8080, i.e.: http://your_public_ip_address:8080/
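The PATH change from step 2 can be sketched as a self-contained snippet (it assumes Spark is installed under /opt/spark, the location used by this image):

```shell
# Add Spark's executables to the current shell's PATH.
export SPARK_HOME=/opt/spark
export PATH="$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin"
echo "SPARK_HOME=$SPARK_HOME"
```

Placing these two lines at the end of ~/.bashrc makes the change persistent for future shells, which is why the steps above edit that file rather than exporting the variables interactively.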

To verify the installation of TensorFlow and the finance libraries, perform the steps below:

  • 1. sudo su
  • 2. cd /root
  • 3. Run the following command to activate the Python environment: source my-env/bin/activate
  • 4. Run the commands below one at a time to check the versions of TensorFlow and the other finance libraries installed: i) pip show tensorflow ii) pip show pyfinance iii) pip show pyspark iv) pip show pyfolio v) pip show pandas-datareader
  • 5. To deactivate the environment, run: deactivate
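The version checks from step 4 can be collapsed into a single loop. This sketch is hedged: my-env is the virtual environment shipped in /root on this image, and the guard lets the snippet run even on machines where that environment does not exist.

```shell
# Activate the bundled environment if it is present.
if [ -f my-env/bin/activate ]; then . my-env/bin/activate; fi

# Report name and version for each bundled package, with a fallback message.
OUT=$(for pkg in tensorflow pyfinance pyspark pyfolio pandas-datareader; do
  pip show "$pkg" 2>/dev/null | grep -E '^(Name|Version):' \
    || echo "$pkg: not installed in this environment"
done)
echo "$OUT"
```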

Disclaimer: Apps4Rent does not offer commercial licenses for any of the products mentioned above. The products come with open-source licenses.

Default ports:

  • SSH: 22
  • HTTP: 80
  • HTTPS: 443
  • Apache Spark: 8080
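A quick way to confirm that the Spark master UI is answering on its default port is to probe it with curl. This sketch is hedged: 127.0.0.1 assumes you run it on the VM itself, and an HTTP status of 000 means nothing is listening on 8080 (e.g. the master has not been started).

```shell
# Probe the Spark web UI port and report the HTTP status code.
if command -v curl >/dev/null 2>&1; then
  CODE=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "http://127.0.0.1:8080/" || true)
  echo "HTTP status on port 8080: ${CODE:-none}"
else
  echo "curl not available on this machine"
fi
```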