Midterm Report
The coding period leading up to the Midterm Evaluation has been an amazing experience for me. I’ve organized this report into two distinct parts, with each part dedicated to a specific phase of the project up to the Midterm Evaluation.
Phase 1 30th May - 30th Jun
Delved into relevant research papers and understood the intricacies of assembly runs, samples, studies and analyses, which form the foundational concepts of MGnify.
Documented all visualizations present on the MGnify website; this documentation was used extensively in Phase 2.
A significant chunk of Phase 1 was spent collaborating with Alejandra Escobar (MGnify Bioinformatician) on enhancing her Jupyter notebook, "Drawing presence/absence KOs for one metagenomic sample", with a specific focus on Pathways Visualization.
The objective was to streamline the notebook, making it more concise, interactive, self-explanatory, and immediately usable.
Technologies Involved : R + Docker + JupyterLab + Quarto + Packages like KEGGREST, Pathview
Pull Request :
Worked on an existing PR by Alejandra, Link : https://github.com/EBI-Metagenomics/notebooks/pull/26
Status : Under Review 🟡
Code in Action :
Phase 2A 1st Jul - 16th Jul
This phase focused on finding visualization solutions that could work seamlessly in both JavaScript and Python environments.
Packages like Plotly and Highcharts were tested, with a preference for Highcharts because it is already used on the MGnify site.
However, Highcharts posed an issue: it couldn’t reference data via a link, which hurt interoperability between React and Jupyter, especially with large datasets.
Then came the idea of a common visualization grammar, which would make it easy to port visualizations to Jupyter notebooks, where users can explore them in more depth and plug in their own data.
Alternatives like pandas-js, Danfo.js, and Vega-Lite were considered.
In the end, the Vega ecosystem was selected due to its compatibility and ability to handle large datasets, ensuring seamless interoperability.
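To illustrate why a shared grammar helps here, below is a minimal sketch of a Vega-Lite specification that references its data by URL instead of embedding it. This is not code from the project: the data URL, the field names (`phylum`, `count`) and the helper `build_bar_chart_spec` are hypothetical, chosen only for illustration.

```python
import json

# Public Vega-Lite v5 schema identifier.
VEGA_LITE_SCHEMA = "https://vega.github.io/schema/vega-lite/v5.json"


def build_bar_chart_spec(data_url, category_field, value_field):
    """Return a Vega-Lite bar chart spec (as a dict) that loads CSV data from a URL.

    Hypothetical helper: the spec only stores a link to the data,
    so the data itself is never copied into the spec.
    """
    return {
        "$schema": VEGA_LITE_SCHEMA,
        # Data is referenced by link; the renderer fetches it at draw time.
        "data": {"url": data_url, "format": {"type": "csv"}},
        "mark": "bar",
        "encoding": {
            "x": {"field": category_field, "type": "nominal"},
            "y": {"field": value_field, "type": "quantitative"},
        },
    }


# Placeholder URL and field names, assumed for this example only.
spec = build_bar_chart_spec(
    "https://example.org/taxonomy-counts.csv", "phylum", "count"
)

# The spec is plain JSON, so it can be handed to any Vega-Lite renderer.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Because the spec is plain JSON, the same document could in principle be rendered on a React page (for example with vega-embed) and in a notebook (for example with Altair's `alt.Chart.from_dict`), which is the kind of interoperability described above.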