In data science and machine learning, both Python and R are popular programming languages. Python is known for its simplicity and variety of libraries, while R is mostly used in statistical analysis and data visualization. Sometimes, it helps to use both Python and R in the same project to take advantage of what each language does best. Tools such as the reticulate package and Jupyter Notebooks make it straightforward for the two to work together.
Why Integrate Python with R?
Integrating Python with R provides several advantages:
- By combining Python and R, you can use Python's extensive libraries, such as TensorFlow and Pandas, alongside R's powerful statistical functions. This allows you to select the most effective tool for each task, improving efficiency and performance.
- Integration reduces the need to switch between different tools or manually transfer data, creating a more cohesive workflow and minimizing errors.
- Effective integration supports maintaining a single environment for both languages, simplifying code management and setup.
- Teams with diverse programming preferences can work together more effectively. Integration allows team members to use their preferred language while contributing to a common project, enhancing collaboration.
- Combining the strengths of Python and R provides a more comprehensive analytical approach. For example, you can leverage R for statistical analysis and Python for machine learning within the same project.
- Integration tools facilitate the sharing of code, results, and insights between team members, promoting effective collaboration and knowledge exchange.
How to Integrate Python with R
There are two primary ways to do this:
- Using the reticulate package in R
- Using Jupyter Notebooks with both R and Python kernels
The sections below walk through both methods step by step.
Method 1: Using the reticulate Package in R
The {reticulate} package lets us call Python from R scripts and R Markdown documents, allowing us to take advantage of both languages in a single project.
Step 1: Install and Load the reticulate Package
First, install and load the reticulate package.
# Install reticulate package (only if not installed yet)
install.packages("reticulate")
# Load reticulate package
library(reticulate)
Step 2: Configure Python
Check which Python installation reticulate is using by calling py_config().
py_config()
Output:
python: C:/Python312/python.exe
libpython: C:/Python312/python312.dll
pythonhome: C:/Python312
version: 3.12.2 (tags/v3.12.2:6abddd9, Feb 6 2024, 21:26:36) [MSC v.1937 64 bit (AMD64)]
Architecture: 64bit
numpy: C:/Users/Tonmoy/AppData/Roaming/Python/Python312/site-packages/numpy
numpy_version: 1.26.4
NOTE: Python version was forced by PATH
python versions found:
C:/Python312/python.exe
C:/Users/Tonmoy/AppData/Local/Programs/Python/Python311/python.exe
C:/Users/Tonmoy/anaconda3/python.exe
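When several Python installations are present, as in the output above, it can help to confirm from the Python side which interpreter you expect R to pick up. The following is a minimal sketch using only the standard library; the function name describe_interpreter is a hypothetical helper, not part of reticulate. Run it from a terminal with the interpreter you intend to use and compare the result with the python: and version: lines reported by py_config().

```python
import platform
import sys


def describe_interpreter():
    """Collect basic facts about the Python interpreter running this code.

    Hypothetical helper for illustration; not part of reticulate.
    """
    return {
        "executable": sys.executable,                 # path to the interpreter binary
        "version": platform.python_version(),        # e.g. "3.12.2"
        "architecture": platform.architecture()[0],  # e.g. "64bit"
    }


# Print one "key: value" line per fact, loosely mirroring py_config()'s layout.
for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

If the path differs from the one py_config() reports, R and your terminal are using different Python installations.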
Step 3: Run Python Code in R
Now, you can run Python code directly within your R script.
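Before the R example, here is a rough Python-only picture of what the py_run_string() calls below do. reticulate executes each string in the scope of Python's main module, and the py object in R exposes that same module, which is why a variable set by one call is visible to the next and to R. The sketch below imitates this with a plain dictionary; main_namespace and run_string are hypothetical names used only for illustration.

```python
# Hypothetical stand-in for the Python main module that reticulate exposes as `py`.
main_namespace = {}


def run_string(code, namespace):
    """Execute a string of Python code inside the given namespace."""
    exec(code, namespace)


# The same three statements the R example passes to py_run_string().
for statement in ("x = 10", "y = 5", "z = x + y"):
    run_string(statement, main_namespace)

# Reading the result back is the analogue of py$z in R.
print(main_namespace["z"])
```

Because all three statements share one namespace, the last one can use x and y defined by the earlier calls.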
# Run Python code
py_run_string("x = 10")
py_run_string("y = 5")
py_run_string("z = x + y")
# Access the Python variable in R
z <- py$z # Correctly access the Python variable 'z' in R
print(z)
Output:
[1] 15
Step 4: Use Python Libraries in R
Now import and use Python libraries like numpy in R.
# Import numpy library from Python
np <- import("numpy")
# Create a numpy array in R
arr <- np$array(c(1, 2, 3, 4))
# Print the numpy array in R
print(arr)
Output:
[1] 1 2 3 4
Step 5: Calling a Python Script
A Python script can be executed directly from R with source_python(), which also makes the functions and variables the script defines available in the R session. In this example, we'll use a script that checks for Armstrong numbers, saved as am.py.
source_python("am.py")
Output:
True
False
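The contents of am.py are not shown above. As an assumption, a script along the following lines would produce the True and False output: it defines a check for Armstrong numbers (numbers equal to the sum of their digits, each raised to the number of digits) and prints the result for two inputs. The function name is_armstrong and the inputs 153 and 123 are illustrative choices, not taken from the original script.

```python
def is_armstrong(number):
    """Return True if number equals the sum of its digits,
    each raised to the power of the digit count."""
    digits = [int(ch) for ch in str(number)]
    power = len(digits)
    return sum(digit ** power for digit in digits) == number


# 153 = 1**3 + 5**3 + 3**3 = 1 + 125 + 27, so this prints True.
print(is_armstrong(153))
# 1**3 + 2**3 + 3**3 = 36, which is not 123, so this prints False.
print(is_armstrong(123))
```

With a script like this, source_python("am.py") would also make is_armstrong callable from R, for example as is_armstrong(370).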
Method 2: Using Jupyter Notebooks with both R and Python kernels
Step 1: Install Jupyter Notebooks
If Jupyter is not installed, you can install it with pip.
pip install jupyter
Step 2: Install the R Kernel for Jupyter
To run R code in Jupyter, install the R kernel by running the following in an R session.
# Install the IRkernel package
install.packages("IRkernel")
# Register the kernel with Jupyter
IRkernel::installspec()
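After registering the kernel, you may want to confirm that Jupyter can see it. The sketch below is one way to do that from Python: it calls the jupyter kernelspec list --json command and extracts the kernel names. The helper names parse_kernel_names and installed_kernels are hypothetical, and the sample JSON is illustrative rather than captured from a real machine. IRkernel registers itself under the name ir by default.

```python
import json
import shutil
import subprocess


def parse_kernel_names(json_text):
    """Extract sorted kernel names from `jupyter kernelspec list --json` output."""
    return sorted(json.loads(json_text)["kernelspecs"])


def installed_kernels():
    """Return the registered kernel names, or None if Jupyter cannot be queried."""
    if shutil.which("jupyter") is None:
        return None  # jupyter is not on PATH
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    try:
        return parse_kernel_names(result.stdout)
    except (ValueError, KeyError):
        return None  # output was not the expected JSON


# Illustrative sample only; a real machine may list different kernels.
sample = '{"kernelspecs": {"python3": {}, "ir": {}}}'
print(parse_kernel_names(sample))
```

Calling installed_kernels() on your own machine should return a list that includes both python3 and ir if both kernels are registered.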
Step 3: Create a New Jupyter Notebook
Open Jupyter by running the following command in your terminal or command prompt, then create a new notebook.
jupyter notebook
Step 4: Running Python and R Code Together
Now we will run Python and R code in the same notebook.
Python Cell
# Python code
x = 5
y = 10
z = x + y
z
Output:
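15
R Cell
A notebook runs a single kernel, so R code cannot read Python variables on its own. In a notebook using the Python kernel, one way to bridge the two is the rpy2 package (pip install rpy2): load its extension once by running %load_ext rpy2.ipython in its own cell, then start the R cell with %%R and pass z in with the -i flag.
%%R -i z
# R code
result <- z + 5 # 'z' was imported from Python by the -i flag
print(result)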
Output:
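[1] 20
Step 5: Switching Between Python and R
Within a Python-kernel notebook, we can switch languages per cell using the cell magics %%R (provided by the rpy2 extension, loaded with %load_ext rpy2.ipython) and %%python (built into IPython). These magics are IPython features, so they are not available in a notebook running the R kernel.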
Switching to R in a Python Cell:
%%R
# R code in a Python cell
R_var <- 20
R_var
Output:
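[1] 20
Switching to Python with %%python:
The %%python magic runs the cell in a separate Python process, so the value must be printed explicitly and python_var is not shared with other cells.
%%python
# Python code run in a separate Python process
python_var = 10
print(python_var)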
Output:
10
Conclusion
Integrating Python code with R allows for the combination of the unique strengths of both languages, enhancing data analysis capabilities. Whether using the reticulate package in R or Jupyter Notebooks with both R and Python kernels, the process is straightforward and opens up a range of possibilities for projects. These tools enable the creation of more efficient, flexible, and collaborative workflows, leading to improved outcomes in data science tasks.