Researchers face heavy burdens. Being a subject matter expert in their field is not enough; they must also serve as quality assurers, database administrators, data security experts, storage specialists, and more to meet the standards that the replication crisis has shown us are necessary.
This piece envisioned a different process for doing science, one that leverages modern infrastructure. Software engineers have spent decades developing tools and practices that enable speed and assure quality. These tools remove burdens by streamlining workflows with centralized platforms and automated tasks. In the best implementations, interfaces guide software engineers through their workflow, facilitating best practices through design. The problems software engineers solve for their knowledge product are highly similar to those facing researchers as they build theirs.
We are building TrovBase to realize the vision of a better process, starting with data management.
TrovBase is a data management platform that facilitates best practices through an interface that prioritizes straightforwardness over optionality. We make it easy to validate and document through a centralized, simple interface. We don’t expect power users; we abstract away the technical database management skills required to implement best practices. This means researchers avoid bad practices in schema specification, data distribution, and everything in between. This also means datasets are cleaner, more standardized, and of higher assured quality. Since we enforce structure on the dataset, we can automatically generate R and Python scripts that do common data analytics tasks, written the way professional data scientists would write them.
We reduce burdens, facilitate speed, and add integrity, just as the best software engineering tools do for developers. Our platform makes data management easier in a way that facilitates getting to analysis faster so researchers can focus more on what they do best.
Interested in becoming an early user or beta tester? Click here.
Interested in investing? Click here.
Interested in joining the team? Click here.
How does TrovBase work?
1. Build your schema using ready-to-go defaults
We prompt you to pre-specify data structures, enabling automated, validated, machine-readable data entry and analysis coding.
2. Export results and collaborate
Once you’ve collected your data, we make it easy to build extracts suited to your analysis: subset and reshape to get exactly the data you need, in the form you need it.
We also provide rich collection metadata so collaborators have an auditable research trail.
3. Generate scripts for analysis
Because we enforce a consistent structure on your project’s data, we can generate Python or R scripts for best-practice implementations of common analyses. Run your analysis from the cloud or your own machine.
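The three steps above can be sketched in plain Python. This is an illustration only, not TrovBase's actual implementation or API: the `Field` class, the `SCHEMA` tuple, the functions `validate`, `extract`, and `generate_script`, the column names, and the `extract.csv` file name are all hypothetical, chosen to show why a pre-specified schema makes validation, extraction, and script generation mechanical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Field:
    """One column in a hypothetical, pre-specified schema."""
    name: str
    dtype: type
    allowed: tuple = ()              # permitted values; empty means any
    minimum: Optional[float] = None  # lower bound for numeric columns


# Illustrative schema for a two-arm study (all names are made up).
SCHEMA = (
    Field("participant_id", str),
    Field("condition", str, allowed=("control", "treatment")),
    Field("score", float, minimum=0.0),
)


def validate(row, schema=SCHEMA):
    """Step 1: return a list of problems; an empty list means the row conforms."""
    problems = []
    for field in schema:
        if field.name not in row:
            problems.append(f"{field.name}: missing")
            continue
        value = row[field.name]
        if not isinstance(value, field.dtype):
            problems.append(f"{field.name}: expected {field.dtype.__name__}")
        elif field.allowed and value not in field.allowed:
            problems.append(f"{field.name}: {value!r} is not an allowed value")
        elif field.minimum is not None and value < field.minimum:
            problems.append(f"{field.name}: {value!r} is below {field.minimum}")
    return problems


def extract(rows, columns, where=lambda row: True):
    """Step 2: keep only the rows and columns an analysis needs."""
    return [{c: row[c] for c in columns} for row in rows if where(row)]


def generate_script(group_col, value_col, schema=SCHEMA):
    """Step 3: emit a small analysis script for columns the schema guarantees."""
    known = {field.name for field in schema}
    for col in (group_col, value_col):
        if col not in known:
            raise KeyError(f"{col!r} is not in the schema")
    lines = [
        "import csv",
        "import statistics",
        "from collections import defaultdict",
        "",
        "groups = defaultdict(list)",
        'with open("extract.csv", newline="") as handle:',
        "    for row in csv.DictReader(handle):",
        f"        groups[row[{group_col!r}]].append(float(row[{value_col!r}]))",
        "",
        "for name, values in sorted(groups.items()):",
        "    print(name, statistics.mean(values))",
    ]
    return "\n".join(lines) + "\n"


rows = [
    {"participant_id": "p01", "condition": "control", "score": 4.0},
    {"participant_id": "p02", "condition": "treatment", "score": 6.5},
    {"participant_id": "p03", "condition": "placebo", "score": -1.0},
]
clean = [row for row in rows if not validate(row)]  # p03 fails two checks
subset = extract(clean, ["condition", "score"])
script = generate_script("condition", "score")
```

Because the generator only accepts columns that exist in the schema, the emitted script can assume those columns are present and well typed. That guarantee is what lets the analysis code be written once and reused across projects.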
Tell me more
What’s your philosophy?
Optionality
Optionality has to pass a cost-benefit test. Optionality adds complexity. Each additional choice users encounter increases the burden of knowledge and adds an opportunity for misuse and error. Our product does not target power users. We are built for users to get up and running quickly, not for hyper-specific configuration.
Best Practices
We facilitate best practices; we aren’t reinventing best practices. For instance, that’s why we aren’t an analytics dashboard and why we do generate scripts. We exist to facilitate easier access to the gold standard ways of doing analytics.
We are curating a board of experts in data management to advise us on the implementation of best practices. View our advisors here.
Why not just have researchers use Git / GitHub?
First, what we offer is more than just a repository service. Schema enforcement gives us robust data validation and advanced data extraction, which Git alone does not provide.
Second, researchers shouldn’t need to become power users of yet another platform (we don’t believe someone should have to be a power user to effectively use TrovBase). Researchers come from a variety of backgrounds and do a variety of tasks. We don’t believe that “amateur programmer” should be part of the job description for every contributor to a research project. As useful as Git is, it comes at the cost of excluding people from research and distracting from the day-to-day work of core research.
Who are you people?
We are a team that uniquely intersects the problem space: software and research. We met at George Mason University and have been learning from and teaching each other for seven years. Sam is our CEO, Andrew is our CTO, and Nathaniel is our ideas guy / resident data science guru.
Follow the team on Twitter.
🦔 Sam’s Twitter
🐺 Nathaniel’s Twitter
🐛 Andrew’s Twitter
How can I help?
We are seeking:
What type of entity are you?
We are a for-profit company. We are developing a product that initially targets scientific researchers. That’s our beachhead market. We anticipate that data will continue to dominate the way most organizations operate, so this product will ultimately serve anyone involved in the creation and curation of data.
Do you have other questions?
This is a work in progress. Help us clarify by shooting me an email: