Why I Still Use Python for High Performance Scientific Computing

In 2013 Campello, Moulavi and Sander published a paper on a new clustering algorithm that they called HDBSCAN. In mid-2014 I was doing some general research on the current state of clustering, particularly with regard to exploratory data analysis. At the time DBSCAN and OPTICS appeared to be the most promising algorithms available. A colleague ran across the HDBSCAN paper in her literature survey and suggested we look into how well it performed. We spent an afternoon learning the algorithm and coding it up, and found that it gave remarkably good results across the range of test data we had. Things stayed in that state for some time, with the intention being to use a good HDBSCAN implementation when one became available.

By early 2015 our clustering needs had grown and, having no good implementation of HDBSCAN to hand, I set about writing our own. Since the first version, coded up in an afternoon, had been in Python, I stuck with that choice, although performance might obviously be an issue. In July 2015, after our implementation was well underway, Campello, Moulavi and Sander published a new HDBSCAN paper and released Java code to perform HDBSCAN clustering. Since one of our goals had been to get good scaling, it became necessary to see how our Python version compared to the high performance reference implementation in Java.

This is the story of how our codebase evolved and was optimized, and how it compares with the Java version at different stages of that journey.

To make the comparisons we'll need data on the runtimes of both implementations across a range of dataset sizes and dataset dimensions. To save time and space I've done that work in another notebook and will just load the data in here.
