Arthur Boffelli Castro

Bioinformatician & PhD Student

Functional Breast Cancer Genomics, Lund University

Investigating the functional impact of splicing factors in breast cancer using genomic and computational approaches.

Let's Connect

About Me

I am a PhD student in the Functional Breast Cancer Genomics group at Lund University, where I study how splicing factors influence breast cancer biology. My research uses genomic data analysis to uncover new insights into gene regulation and cancer mechanisms.

Alongside my research, I work as a systems administrator for the Bioinformatics Master's Programme at Lund University, maintaining the computational infrastructure used for teaching. I enjoy coding and building tools, whether it's exploring biological data, creating workflows, or helping students get the most out of our teaching infrastructure.

Outside of work, I enjoy playing tennis, practicing guitar, and drawing.

Technical Expertise

Bioinformatics

Genomics, Transcriptomics, Variant Analysis, Functional Annotation, Image Analysis

Programming

R, Python

Data Analysis

tidyverse, Pandas, Data Visualization

Workflow Management

Snakemake, Conda

Version Control

Git, GitHub

Automation & Systems

Linux, Ansible, Server Management

Command-line & HPC

Bash scripting, AWK, SLURM

Web Tools

HTML, CSS, Shiny

Projects

Enlarged Cancer Cell Detection

Developed a pipeline for detecting enlarged cancer cell nuclei in breast cancer tissue microarray (TMA) images. The workflow combines QuPath and StarDist for nuclear detection and feature extraction, with Snakemake automating the analysis steps.

The project is ongoing and the pipeline will continue to be developed by other team members. The code is publicly available on GitHub.

Snakemake, QuPath, StarDist, Image Analysis, Python, R, Bash, AWK, Groovy
Source Code

Interomics

A Shiny-based web application for metagenomic data exploration and visualization. Built in R using phyloseq, metacoder, and plotly, Interomics enables users to generate interactive diversity, abundance, and taxonomic tree plots directly from their data tables.

R, Shiny, metacoder, phyloseq, plotly
Live App, Source Code

Runemark Lab Website

Designed and built the official Runemark Lab website from scratch using HTML, CSS, and JavaScript. I handled the design, setup, and deployment of the publicly accessible site.

HTML, CSS, JavaScript, Web Design
Visit Site

Copy Number Analysis of scWGS Data

Performed single-cell whole-genome sequencing (scWGS) analysis to investigate genomic instability in drug-resilient breast cancer cells. This project contributed to the publication "Drug-resilient Cancer Cell Phenotype Is Acquired via Polyploidization Associated with Early Stress Response Coupled to HIF2α Transcriptional Regulation".

The analysis pipeline was developed in Bash using established bioinformatics tools for alignment and copy number inference.

Bash, bwa, samtools, eagle, CHISEL
Publication, Source Code

Identification of Functional Synonymous Variants in Breast Cancer

Initially developed during my Master's thesis and now extended in my PhD, this computational pipeline identifies potentially functional synonymous variants across breast cancer cohorts. The workflow integrates variant calling, annotation, and predictive modeling using GATK, VCFtools, VEP, MMSplice, and custom Python and R scripts. The pipeline pinpointed candidate variants for experimental validation and forms the foundation for ongoing functional studies.

GATK, VCFtools, VEP, MMSplice, Python, R, Breast Cancer
Source Code

Deregulation of the Splicing Machinery in Breast Cancer

This ongoing project investigates the expression and function of splicing factors across breast cancer subtypes using SCAN-B RNA-seq data. The analysis integrates gene expression, alternative isoform usage, and somatic variant data to identify regulatory changes in splicing factors and their potential impact on target gene networks.

R, Python, RNA-seq, edgeR, MAJIQ, Variant Calling, Gene Set Enrichment Analysis, Pathway Enrichment Analysis
Available upon request

Contact Me