Datadog's Dash 2024 has just wrapped up. As Observation Cloud, a competitor of Datadog, we have carefully analyzed every newly announced feature and found some very interesting things, which we walk through in full today. (All of the Dash features discussed here come from https://www.datadoghq.com/blog/dash-2024-new-feature-roundup-keynote/, so you can refer to the original post for details.)
Part 1: DASH 2024 Keynote Roundup
Observability capabilities
1. LLM Observability
Compared with LangSmith, which only covers debugging agents, Datadog's feature extends observation to the whole life cycle of agent development rather than just the debugging stage, which makes it more valuable than LangSmith alone.
Observation Cloud has also been developing this capability for some time, and we expect to release it in the near future.
2. A DDAgent that is more compatible with the OTel Collector
Datadog has finally, officially and completely, brought OTel into its own system. The trend toward OTel as the standard can no longer be reversed: any instrumentation, whatever technology it uses, will inevitably converge on OTel's data structures, and commercial vendors have had to compromise on this. What matters in the future of observability is therefore not how data is collected or how it is observed, but that everyone has chosen a unified data structure and paradigm. Datadog has supported OTel's data structures for a long time; what is new is that DDAgent can now be used as an OTel Collector.
Observation Cloud has arguably supported this since day one. In the Chinese market in particular, many teams also instrument with SkyWalking, early Zipkin, Jaeger, and so on, so from day one our agent has acted as a collector not only for OTel but for these other technologies as well. At least in terms of compatibility, Observation Cloud covers more than Datadog.
3. Log Workspaces with powerful data analysis capabilities
Logs have always been an important part of observability data, and log support is a focus of Datadog's investment. This time Datadog introduced a SQL-based log analysis workbench.
Observation Cloud provides a unified query language, DQL, so not only logs but all data can be analyzed in a highly customized way through DQL. Observation Cloud does not currently expose SQL analysis publicly, even though it is itself built on an MPP data warehouse. In practice, similar results can be achieved with DQL, and the variety of data that can be analyzed far exceeds Datadog's.
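To make "SQL-based log analysis" concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not Datadog's Log Workspaces or Observation Cloud's DQL; the table layout and the sample records are invented purely for illustration.

```python
import sqlite3

# Hypothetical log records: (service, status, duration_ms). Invented for illustration.
LOGS = [
    ("checkout", "error", 120),
    ("checkout", "ok", 80),
    ("search", "ok", 35),
    ("checkout", "error", 200),
    ("search", "error", 50),
]


def error_counts(rows):
    """Return (service, error_count) pairs, services with the most errors first."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE logs (service TEXT, status TEXT, duration_ms INTEGER)"
        )
        conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT service, COUNT(*) AS errors FROM logs "
            "WHERE status = 'error' "
            "GROUP BY service ORDER BY errors DESC, service"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(error_counts(LOGS))
```

The point of the sketch is only that once logs are addressable as rows, grouping and ranking become one declarative query rather than custom code.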
4. Live Debug
For programmers this is a god-tier feature. To some extent, an observability platform is not just a platform for operations: its biggest difference from a traditional monitoring system is that it is positioned more as a remote debugging platform, so every programmer will be happy to debug live in the production environment.
Observation Cloud already has the relevant technology but has no plan to productize it in the short term. What do you think about injecting code directly into a live service for debugging?
5. Analytics for product interaction design
This feature enhances the existing RUM product with Session Replay, Heatmap, Sankey analysis, and other capabilities. As an excellent interactive product itself, Datadog really does favor front-end engineers.
Observation Cloud is also popular with front-end engineers, and we are adding related capabilities. Look out for Heatmap and Sankey analysis in our upcoming updates.
Security capabilities
Datadog continues to strengthen its security capabilities. Observation Cloud currently has no plans to enter the security field, so we do not cover this part; interested readers can consult the original post.
Action/Execution
From here on, Datadog starts to extend its reach: beyond observation, it is moving into control. Unlike traditional Chinese-style operations control, however, Datadog puts more emphasis on control driven by data.
1. Automatic scaling for Kubernetes
Datadog can now scale your Kubernetes clusters manually or automatically, according to your policies, based on billing data or monitoring data.
Observation Cloud can also provide control capabilities through its Func platform, but unlike Datadog we do not offer this ability directly. In the Chinese market, a cloud application that can directly manage your infrastructure and applications is still a rather scary idea. Would you accept it?
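As a rough illustration of scaling driven by monitoring data, the sketch below applies the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler, desired = ceil(current * currentMetric / targetMetric), clamped to a policy range. This is an assumption chosen for illustration; it is not a description of how Datadog's autoscaling or Observation Cloud's Func platform actually decides, and the function name and defaults are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Proportional scaling rule, clamped to the policy's replica range.

    current_metric and target_metric are in the same unit, for example
    average CPU utilization in percent.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 replicas at 90% CPU with a 60% target scale out; at 30% they scale in.
    print(desired_replicas(4, 90, 60))
    print(desired_replicas(4, 30, 60))
```

The min/max clamp is where a "policy" enters: the data proposes a replica count, and the policy bounds what the platform is allowed to do with it.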
2. Correlating changes with alerts
When an alert fires, Datadog can look back at recent changes, including code changes, to help engineers locate the problem quickly. From an R&D perspective this is another very useful feature: you no longer have to track down the version and read through the code yourself.
Observation Cloud does not have this feature yet, but it is already on the roadmap.
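The core idea of correlating changes with alerts can be sketched as a time-window lookup: given an alert timestamp, list the changes deployed shortly before it. The record fields, the 30-minute window, and the sample data below are all hypothetical; a real product would also weigh service topology and ownership.

```python
from datetime import datetime, timedelta

# Hypothetical change records, invented for illustration.
CHANGES = [
    {"id": "deploy-101", "service": "checkout", "deployed_at": datetime(2024, 7, 1, 9, 0)},
    {"id": "deploy-102", "service": "checkout", "deployed_at": datetime(2024, 7, 1, 11, 40)},
    {"id": "deploy-103", "service": "search", "deployed_at": datetime(2024, 7, 1, 11, 55)},
    {"id": "deploy-104", "service": "checkout", "deployed_at": datetime(2024, 7, 1, 12, 30)},
]
ALERT_TIME = datetime(2024, 7, 1, 12, 0)


def changes_before_alert(alert_time, changes, window=timedelta(minutes=30)):
    """Return changes deployed within `window` before the alert, newest first."""
    start = alert_time - window
    hits = [c for c in changes if start <= c["deployed_at"] <= alert_time]
    return sorted(hits, key=lambda c: c["deployed_at"], reverse=True)


if __name__ == "__main__":
    for change in changes_before_alert(ALERT_TIME, CHANGES):
        print(change["id"], change["service"], change["deployed_at"].isoformat())
```

Sorting newest first reflects the usual heuristic that the most recent change before an alert is the first suspect.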
3. Automatic root cause analysis with the Bits AI large model
This is one of Datadog's own capabilities combined with large models: it applies RAG to observability data to produce guided analysis.
Observation Cloud is also tuning its prompts and workflows for better results.
4. A better observability analysis experience for on-call
Datadog has its own mobile app, which was recently enhanced to give engineers who receive on-call pages on their phones a better overall experience and better data analysis.
Observation Cloud also has its own app, but frankly its overall capability is still far behind Datadog's.
Part 2: DASH 2024 Infrastructure Roundup
Cloud cost management
Datadog has enhanced its cloud cost management with the following capabilities:
1. Centralized cost analysis across all cloud services, including cost statistics for some SaaS services
2. Monitoring and management of changes in cloud costs
3. Cost recommendations for AWS
4. Support for Twilio (cloud communications) costs
Thanks to Observation Cloud's powerful configuration capabilities, many of our users already use Observation Cloud to analyze, manage, and monitor their Alibaba Cloud, Huawei Cloud, and AWS spending.
Serverless monitoring
1. Remote instrumentation of Lambda applications
2. Comprehensive visualization support for AWS Step Functions
3. Automatic instrumentation of Azure App Service Linux Web Apps
4. Automatic instrumentation of Google Cloud Run services
Datadog clearly continues to strengthen its support for Lambda-style function compute, and its multi-cloud support is extensive. Observation Cloud is behind here: we currently support only AWS Lambda, through AWS's open-source Lambda Layer Extension, and our own Layer Extension is still under development. It will take time to catch up.
Log management
1. Data redaction at collection time through DDAgent
2. Flex Logs, a cheaper log storage solution
On log management: from the start, Observation Cloud placed much of its technology on the client side, so client-side redaction has been supported through Pipeline from the beginning. In contrast to Datadog, it is server-side processing, including redaction, that we have only just added.
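Client-side redaction means sensitive values are masked before a log line ever leaves the host. The sketch below shows the idea in plain Python with regular expressions; it is not DDAgent configuration or Observation Cloud's Pipeline syntax, and the two patterns are simplified assumptions that would need hardening for real data.

```python
import re

# Simplified, illustrative patterns; real-world redaction rules are stricter.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD = re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b")


def redact(line):
    """Mask e-mail addresses and 16-digit card numbers in a single log line."""
    line = EMAIL.sub("[EMAIL]", line)
    return CARD.sub("[CARD]", line)


if __name__ == "__main__":
    print(redact("user alice@example.com paid with 4111 1111 1111 1111"))
```

Running the same function on the server side gives the "center-side" variant; the difference is only where in the pipeline the raw value stops existing.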
Cheaper tiered log storage is also something Observation Cloud is working toward, and we expect to release some interesting storage options this year.
Network monitoring
1. Find problems along the network path
2. Look up IP address information from an IP library
3. Monitor network performance
4. Add tags to custom-discovered network devices
As with Observation Cloud, Datadog's on-premises network monitoring was added later, and Datadog is also seen as catching up in the NPM field. Observation Cloud is still relatively weak in network device monitoring compared with Zabbix, and we are working hard to fill this gap.
Analysis capabilities
1. DDSQL Editor
2. Rapid graph-based root cause analysis
3. Better alert analysis panel
4. Correlation of infrastructure failures and changes
For Observation Cloud, the first of these maps to the DQL capabilities we already have. Apart from not being SQL, DQL has long supported similar self-service analysis. SQL is possible too: customers on our deployment version can actually enable the SQL entry point.
The second feature, another one that combines large models, is very inspiring to us, and we look forward to providing similar capabilities in the future.
The third feature is well worth learning from, and we will look into a similar capability as soon as possible.
The fourth, change correlation analysis, is part of Datadog's overall change observability; our equivalent will arrive when we introduce our own overall change observation and analysis feature.
Platform capabilities
1. Datadog Disaster Recovery
2. Managing DDAgent through Fleet Automation
3. Support for the U.S. government's dedicated cloud
Datadog Disaster Recovery gives administrators of Datadog, as a SaaS, a super-privileged way to act, evidently to win the trust of large enterprises. On this point, Observation Cloud offers an on-premises (OP) deployment mode, and the console in OP mode already has this ability.
Fleet Automation corresponds to Observation Cloud's DCA (Datakit Control Administer), which helps you easily manage all of your agents.
As for support for the U.S. government's dedicated cloud: Observation Cloud has just obtained compatibility certification for Alibaba Cloud Feitian Proprietary Cloud and can provide a full range of services to all Alibaba Cloud Proprietary Cloud users. We also support HCS from HUAWEI CLOUD and TCS from Tencent Cloud.
Part 3: DASH 2024 Applications Roundup
Enhancements to APM and continuous profiling
1. Easier configuration of APM probes
Datadog has strengthened this part of the user experience. Until now, Observation Cloud used the same configuration flow as Datadog rather than a simple installation flow like those of New Relic and Dynatrace. Now that Datadog has compromised, we will of course follow. (That said, this kind of guided flow causes many problems in real use; it is better suited to winning users over at the start and to simple applications. We will explain this in a later article.)
2. Understand the health of the service
3. Waterfall view for distributed tracing
Observation Cloud has supported this capability for almost two years. It is great to see Datadog support it in 2024, in a form almost identical to ours.
4. Runtime profiling analysis
Observation Cloud is currently adding metric extraction from profiling data, which will produce more metric time series for analysis (and, of course, more cost). Whether we provide a similar analysis capability in the future depends on what customers tell us, mainly because of the added cost; Datadog is, of course, very expensive.
5. Profiling CPU cost for Go reduced significantly, by 14%
Observation Cloud is compatible with the profiling component of ddtrace, so if you use that component you get this improvement automatically.
6. Automatic analysis of application memory leak trends
A very good capability; Observation Cloud will follow as soon as possible.
Observability for data services
1. Data Jobs Monitoring for big data transfer and processing
2. Data Streams Monitoring supports more data products (Spark jobs, S3 buckets, Snowflake tables)
3. Track downstream data consumption
4. Automatically discover PostgreSQL and Kafka with Datadog USM
5. Directly monitor and manage Snowflake
6. PostgreSQL schema observability support
In overall data monitoring and observability, Observation Cloud does lag well behind Datadog. The overseas technology ecosystem is relatively unified for both databases and big data systems, without so many open-source forks, which lets Datadog offer standardized products here. Observation Cloud has also not invested much in this area so far. We are currently considering cooperation with Chinese products such as AutoMQ and OceanBase to jointly build a comprehensive observability solution for data processing pipelines.
Enhanced digital experience analytics
1. More powerful front-end performance analysis assistance
Observation Cloud is also continually refining its RUM page analysis. Datadog's additions here are very good, and we will introduce similar ones as soon as possible.
2. Use real user traffic data to reveal problems in the code
This is another feature that greatly improves the front-end engineer's experience by bringing all the elements of RUM together for analysis. We will consider supporting this capability.
3. Tail sampling for RUM Session Replay
Observation Cloud has supported this for a long time: collected session replays can be sampled through Datakit, for example keeping only replays that contain errors.
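Tail sampling means the keep-or-drop decision is made after a session has finished, when its outcome is known. A minimal sketch, assuming a hypothetical session record with an `error_count` field and a hypothetical `sample_rate` for error-free sessions; this is not Datakit's or Datadog's actual configuration.

```python
import random


def keep_replay(session, sample_rate=0.1, rng=random.random):
    """Decide, once the session is complete, whether to keep its replay.

    Sessions with at least one error are always kept; error-free sessions
    are kept with probability `sample_rate`.
    """
    if session["error_count"] > 0:
        return True
    return rng() < sample_rate


if __name__ == "__main__":
    sessions = [{"id": "s1", "error_count": 0}, {"id": "s2", "error_count": 3}]
    kept = [s["id"] for s in sessions if keep_replay(s, sample_rate=0.0)]
    print(kept)
```

Setting `sample_rate` to 0.0 reproduces the "only collect replays with errors" policy mentioned above.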
4. Unity SDK support
Another capability that Observation Cloud supported earlier; we already support Unity apps.
5. Crash report integration for hybrid applications
Datadog has consistently delivered a very good experience in this area; we need to do it well too.
6. Optimized integration of browser SDKs
Observation Cloud's web SDK is likewise easy to inject.
7. Reproduce errors via the VS Code plugin
Yet another feature that spoils programmers; Datadog really does spoil them. I believe everyone is very interested in this ability, but is your company willing to pay more for it?
DASH 2024: Guide to Datadog's newest announcements for security
As noted in Part 1, Datadog continues to strengthen its security capabilities. Observation Cloud currently has no plans to enter the security field, so we do not cover this part; interested readers can consult the original post.
DASH 2024: Guide to Datadog's newest announcements for teams
Service reliability and delivery
1. Team DORA metrics
This is a governance feature delivered as an integrated dashboard. If anyone needs it, Observation Cloud can provide a similar dashboard, and many others besides.
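For readers unfamiliar with DORA metrics, the sketch below computes three of them (deployment frequency, change failure rate, and mean lead time for changes) from a list of deployment records. The record fields and sample data are hypothetical, and this is not how Datadog's or Observation Cloud's dashboards are implemented; it only shows what such a dashboard summarizes.

```python
from datetime import datetime

# Hypothetical deployment records, invented for illustration.
DEPLOYS = [
    {"committed_at": datetime(2024, 7, 1, 8, 0), "deployed_at": datetime(2024, 7, 1, 10, 0), "failed": False},
    {"committed_at": datetime(2024, 7, 1, 9, 0), "deployed_at": datetime(2024, 7, 1, 13, 0), "failed": True},
    {"committed_at": datetime(2024, 7, 2, 8, 0), "deployed_at": datetime(2024, 7, 2, 14, 0), "failed": False},
    {"committed_at": datetime(2024, 7, 2, 9, 0), "deployed_at": datetime(2024, 7, 2, 17, 0), "failed": False},
]


def dora_summary(deployments, period_days):
    """Summarize deployment frequency, change failure rate, and mean lead time."""
    total = len(deployments)
    if total == 0:
        return {"deploys_per_day": 0.0, "change_failure_rate": 0.0, "mean_lead_time_hours": 0.0}
    failed = sum(1 for d in deployments if d["failed"])
    lead_hours = [
        (d["deployed_at"] - d["committed_at"]).total_seconds() / 3600
        for d in deployments
    ]
    return {
        "deploys_per_day": total / period_days,
        "change_failure_rate": failed / total,
        "mean_lead_time_hours": sum(lead_hours) / total,
    }


if __name__ == "__main__":
    print(dora_summary(DEPLOYS, period_days=2))
```

The fourth DORA metric, time to restore service, needs incident records rather than deployment records, so it is left out of this sketch.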
2. Overall SLO dashboard
Like the previous item, this is an integrated dashboard. Observation Cloud also has its own SLO dashboard, in a different style.
Team data accessibility
1. Datadog CoTerm
After acquiring CoTerm, Datadog integrated its capabilities. The first of them is actually a collaborative terminal, which amounts to a bastion-host-like capability.
2. Cross-organizational data analysis
Observation Cloud has had this capability for an estimated year or more and can union data from different organizations. Hopefully Datadog will catch up soon; after all, DDSQL is already available.
3. Datadog App Builder
Through App Builder, a Datadog dashboard can be turned into an interactive application. Observation Cloud has this ability too, although the user experience is slightly inferior. If you want to try it, choose the command space in an Observation Cloud dashboard and then write the corresponding function in Func; this turns the dashboard into an interactive application.
Online spreadsheet analysis
This is a very friendly feature: Datadog provides an online, Excel-like grid for analyzing exported CSV files without a local copy of Excel.
Manage sensitive data
We fully supported both of these features last year. Observation Cloud has Fortune 500 customers that pay close attention to security and compliance, and they are also heavy users of Datadog.
Summary
Datadog is the current global leader in monitoring and observability, and latecomers such as Observation Cloud have a great deal to learn from it. If you look carefully at the new features and improvements shown at Dash 2024, a few points stand out:
- Datadog is continuously trying to bring everyone on the IT team together on a single platform
- Datadog puts great effort into pleasing engineers, focuses strongly on user experience, and conveys respect for every engineer
- Datadog has begun to expand its boundaries, including the security features we did not cover here
In addition, we are proud to say that Observation Cloud's overall design ideas and concepts are almost the same as Datadog's, which is why Observation Cloud supports many of these features, some even earlier than Datadog. We believe most feature requirements come from end users, and since we serve the same kind of user, many of our ideas are naturally similar. (For example, the Case Management feature Datadog released last year arrived in almost the same month as Observation Cloud's anomaly tracking feature.)