R and Python Comparison

A code chunk in R and its equivalent in Python: each reads a spectrum from a CSV file, subtracts the baseline, performs area normalization, and plots the spectrum in high quality. Both snippets assume that spectrum.csv has the columns wavelength and intensity.

In R:

# tidyverse already attaches ggplot2, readr, dplyr and purrr
library(tidyverse)

# Read spectrum data from csv
spectrum <- read_csv("spectrum.csv")

# Subtract baseline (a constant offset equal to the minimum intensity)
spectrum_baseline <- spectrum %>%
  mutate(baseline = min(intensity)) %>%
  mutate(intensity = intensity - baseline)

# Perform area normalization (the sum of intensities is proportional
# to the area when the wavelengths are evenly spaced)
spectrum_normalized <- spectrum_baseline %>%
  mutate(area = sum(intensity)) %>%
  mutate(intensity = intensity / area)

# Plot spectrum
ggplot(spectrum_normalized, aes(wavelength, intensity)) +
  geom_line(color = "blue") +
  scale_x_continuous(name = "Wavelength (nm)", limits = c(200, 1000)) +
  scale_y_continuous(name = "Intensity (a.u.)") +
  theme_minimal()

In Python:

import pandas as pd
import matplotlib.pyplot as plt

# Read spectrum data from csv
spectrum = pd.read_csv("spectrum.csv")

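# Subtract baseline (a constant offset equal to the minimum intensity)
baseline = spectrum["intensity"].min()
spectrum["intensity"] = spectrum["intensity"] - baseline

# Perform area normalization (the sum of intensities is proportional
# to the area when the wavelengths are evenly spaced)
area = spectrum["intensity"].sum()
spectrum["intensity"] = spectrum["intensity"] / area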

# Plot spectrum
plt.plot(spectrum["wavelength"], spectrum["intensity"], color="blue")
plt.xlabel("Wavelength (nm)")
plt.ylabel("Intensity (a.u.)")
plt.xlim([200, 1000])
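plt.ylim([0, spectrum["intensity"].max()])
plt.ticklabel_format(style='sci', axis='y', scilimits=(0,0))
plt.minorticks_on()
# Pass the on/off flag positionally: the "b" keyword was renamed to
# "visible" and no longer works in recent Matplotlib releases
plt.grid(True, which='major', color='gray', linestyle='-')
plt.grid(True, which='minor', color='gray', linestyle='--')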
plt.show()
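
Both versions above normalize by the plain sum of intensities, which matches the area only up to a constant factor and only when the wavelength points are evenly spaced. As a minimal sketch of one alternative (not part of the original workflow), the function below uses the trapezoidal rule, so the area under the normalized curve is 1 even for uneven sampling. The function name normalize_area and the four-point spectrum are hypothetical and only serve as illustration. Wavelengths are assumed to be sorted in ascending order.

```python
import numpy as np


def normalize_area(wavelength, intensity):
    """Subtract a constant (minimum) baseline, then scale to unit trapezoidal area.

    Hypothetical helper for illustration; assumes ascending wavelengths.
    """
    x = np.asarray(wavelength, dtype=float)
    y = np.asarray(intensity, dtype=float)

    # Same constant-offset baseline as in the snippets above
    y = y - y.min()

    # Trapezoidal rule: mean of neighbouring intensities times the wavelength step
    area = np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x))
    if area == 0:
        raise ValueError("spectrum is flat after baseline subtraction")

    return y / area


# Made-up, unevenly sampled spectrum (illustration only)
wavelength = [200.0, 300.0, 500.0, 1000.0]
intensity = [1.0, 3.0, 2.0, 1.0]
normalized = normalize_area(wavelength, intensity)
```

With a pandas data frame, the columns can be passed directly, for example normalize_area(spectrum["wavelength"], spectrum["intensity"]).
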
Krzysztof Banas
Principal Research Fellow

I work as beam-line scientist at Singapore Synchrotron Light Source. My research interests include application of advanced statistical methods for hyperspectral data processing (dimension reduction, clustering and identification).
