Data-driven unification of 72 biomedical publication types and study designs into a hierarchical rubric

Smalheiser, Neil R, Menke, Joe D, Holt, Arthur W

19d ago · 21 min read · en · Insight

education science biomedical research classification data-driven methodology

Summary

This article presents a data-driven approach to unifying 72 biomedical publication types and study designs (PTs) into a single rubric and hierarchy. The researchers computed pairwise similarities of each PT against all others to form a similarity matrix, then performed hierarchical clustering to categorize them. Spearman correlations among PT pairs ranged from strongly negative to strongly positive (−0.732 to +0.997), with a mean of 0.176. The analysis yielded 13 clusters of PTs and 5 broader categories, providing a unified framework for classifying biomedical research types.
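The pipeline described above (pairwise Spearman correlations, a similarity matrix, then hierarchical clustering) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption for illustration and not the authors' implementation: the names `PT_A` through `PT_D` are hypothetical placeholders, the score vectors are made-up toy data, and the choice of average linkage with `1 - correlation` as the distance is ours, since the summary does not state which linkage or input features the study used.

```python
from itertools import combinations

def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def similarity_matrix(profiles):
    """Correlate every item against every other item."""
    names = list(profiles)
    return {a: {b: spearman(profiles[a], profiles[b]) for b in names}
            for a in names}

def average_linkage(sim, n_clusters):
    """Agglomerative clustering; distance is 1 - correlation (assumed)."""
    clusters = [[name] for name in sim]

    def dist(c1, c2):
        total = sum(1 - sim[a][b] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical toy profiles: one made-up score per article for each PT.
profiles = {
    "PT_A": [0.9, 0.8, 0.1, 0.2, 0.7],
    "PT_B": [0.8, 0.9, 0.2, 0.1, 0.6],
    "PT_C": [0.1, 0.2, 0.9, 0.8, 0.3],
    "PT_D": [0.2, 0.1, 0.8, 0.9, 0.4],
}

sim = similarity_matrix(profiles)
clusters = average_linkage(sim, n_clusters=2)
print(clusters)
```

In this toy setup the two pairs with similar score patterns end up in separate clusters. The study applied the same general idea to 72 PTs and reported 13 clusters and 5 broader categories; the cluster count of 2 here is only a property of the toy data.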

Source

Data-driven unification of 72 biomedical publication types and study designs into a hierarchical rubric · academic.oup.com

Key quotes

Our goal is to unify the 72 biomedical publication types and study designs (collectively, PTs) into a single rubric and hierarchy.

This is carried out in a data-driven manner by computing pairwise similarities of each PT against all others to form a similarity matrix.

Spearman correlations among PT pairs ranged from strongly negative to strongly positive (−0.732 to +0.997), with a mean of 0.176.

Overall, we obtained 13 clusters of PTs and 5 broader categories.
