External reviews
External reviews are not included in the AWS star rating for the product.
Built to accelerate development
What do you like best about the product?
I have been using Databricks for almost 4 years and it has been a great asset to our development as a team and our product.
Shared folders of re-usable and tracked notebooks allow us to work on tasks only once, minimising duplication of work, which in turn accelerates the development cycle.
One of my personal favourites is the workflows feature, which allowed us to automate a variety of tasks and freed up capacity for us to focus on the right problems at the right time.
Another great selling point for me is that collaborators can see each other typing and highlighting live.
What do you dislike about the product?
UX could be improved
While I appreciate the addition of new features, developments and experiments, the frequency of changes has made it tiring and frustrating for me recently.
Too much, too frequently. The 'new notebook editor' is a great example here. The editor itself could be a very useful change, but changing all the keyboard shortcuts at the same time without letting the user know is questionable to me.
I would prefer it if changes were rolled out less frequently, with detailed patch notes (see Dota 2 for example) and configurable options in the user settings.
E.g. I would use the experimental 'new notebook editor' if I could keep the keyboard shortcuts the same.
Less frequent, more configurable updates please.
One of the biggest pain points for me is the log-in and log-out process. Why does Databricks have to log me out every couple of hours, especially while I am typing in a command cell?
Could this be improved please?
Also, would love it if libraries on clusters could be updated without having to restart the cluster.
Having said all this, I do love some of the new features, such as the new built-in visualisation tool, but I would love it even more if titles could be added and adjusted.
What problems is the product solving and how is that benefiting you?
Databricks is used as the core of our research environment.
It is used to provide quick and efficient analysis of whatever question or problem might arise while keeping the production environment safe and undisturbed.
Databricks for job creation, cluster management, and managing Spark jobs effectively
What do you like best about the product?
Easy to schedule and run jobs and integrate with Airflow and Azure storage accounts.
Easy to execute code cell-wise and debug the errors because of its interpreter.
What do you dislike about the product?
It doesn't offer auto-complete suggestions while coding the way other IDEs do.
What problems is the product solving and how is that benefiting you?
We use it for our data engineering projects on large-scale datasets.
Great Collaborative Platform for Data Science Projects
What do you like best about the product?
I have been using the Databricks platform for business research projects and building ML models for almost a year. It has been a great experience to be able to run analysis and model testing for big data projects in a single platform without switching between SQL Server and a development environment with Python, R, or Stata. Also, I like the fact that MLflow can track data ingestion for any data shift in real time for model retraining purposes.
What do you dislike about the product?
We have had issues using MLflow and the feature store on Databricks for ML projects, which slows down the development process. I wish there was better documentation on these tools or more diverse examples to demonstrate different use cases. Also, the test-train split with MLflow does not support splitting time series by time interval for model validation purposes.
What problems is the product solving and how is that benefiting you?
The Databricks lakehouse platform allows the data science team to work better with the development team in a single platform, which helps improve ML project development in the long run.
Best data all in one solution
What do you like best about the product?
PySpark, Delta Lake, the way that it integrates seamlessly with AWS services, and how they managed to open source everything. It provides a great managed Spark infrastructure.
What do you dislike about the product?
Harder to integrate with more legacy data sets. Requires you to move data into AWS to use.
What problems is the product solving and how is that benefiting you?
Databricks is creating a solution that allows us to query and manage our data lake with immense performance. Delta Lake ensures ACID transactions on data, and the query performance from Databricks is unmatched.
Good experience so far!
What do you like best about the product?
Great unification of functions & features and data sharing across the organization.
What do you dislike about the product?
There's still a lot to learn and make sure that all the functions I use work well and properly. Nothing bad, just more to find out.
What problems is the product solving and how is that benefiting you?
It's helping me do my job and unifying data sources across all my different work streams.
User friendly and intuitive platform
What do you like best about the product?
As a Cloud Operation Specialist, I deploy the Databricks workspace and set up and manage the clusters. It’s easy to set up and manage the users within the workspace.
UI is very user friendly and intuitive.
What do you dislike about the product?
Error messages could be more detailed and better explained.
What problems is the product solving and how is that benefiting you?
Highly efficient in executing queries and analysing data.
Powerful platform
What do you like best about the product?
The platform is powerful and flexible enough to do almost anything you want to do, like ETL, ML models, data mining, simple ad hoc queries, etc. It is also easy to switch languages between Python, SQL, R, Scala, etc. anytime you want.
What do you dislike about the product?
The search function is not my favorite. I often like to use the search function from the browser, but it doesn't work well with scripts in a big cell. Also, the clusters take a while to start.
What problems is the product solving and how is that benefiting you?
It meets all my data mining and data science project needs. Simple and easy to use.
The most flexible and potent data platform available, without a doubt
What do you like best about the product?
The most reliable and user-friendly option for creating ELT pipelines that employ Python, Spark, and SQL is Databricks. Configuring and deploying it doesn't take much labour, and it frees developers from having to worry about setting up the infrastructure.
What do you dislike about the product?
Using the same cluster to perform several streaming tasks.
Since shutdown immediately following the job run/fail is configured by default, job clusters cannot be reused even for the same retry in PRODUCTION. Checking potential ways to raise this limit.
What problems is the product solving and how is that benefiting you?
Comprehensive batch and streaming pipeline
Delta Lake
History and versioning
Delta log transaction with ACID
Validation and quarantine are methods of data curation.
Information Ingestion Using an Autoloader
Intuitive and Powerful
What do you like best about the product?
As a frequent user of Databricks, I have found that it makes my life much easier by simplifying processes and allowing me to develop proof-of-concept designs rapidly. The orchestration of notebooks via workflows provides excellent visualization and enables me to conduct real-time demos for members on the business side. In addition, the integration with Azure and AWS means that Databricks does not operate in isolation and allows me and other engineering team members to transform the large amounts of data ingested via our enterprise pipelines.
What do you dislike about the product?
There can sometimes be issues integrating Databricks workflows with open source frameworks, often requiring lots of debugging and trial and error. Additionally, I've been told that the platform can be pretty expensive.
What problems is the product solving and how is that benefiting you?
The Databricks Lakehouse Platform allows me to create and deploy workflows to orchestrate and test proof-of-concept ideas in our organization. This will enable us to validate ideas and develop presentations for the organization's business side.
Progressing in the right direction
What do you like best about the product?
Being able to quickly get the environment up and running for any kind of workload. The support for all three languages and catering to the needs of Data Engineering and ML.
What do you dislike about the product?
Too many customizations are needed to achieve the right mix of parameterization for optimal performance. On the other hand, Snowflake provides lots of features out of the box without the developer worrying about these things.
What problems is the product solving and how is that benefiting you?
Managing the intermediate layers and data engineering activities like wrangling/mashing/slicing/dicing of the data well. Greater control of the data via data frames.