April 27, 2024

Healthcheck Part 1. How we support 1000+ Make.com scenarios

:handshake: 1. Introduction

Our team has been building automations for Enterprise clients professionally for 5 years. In that time, we have created more than 1000 scenarios across 30 Make.com accounts. We aim to keep business-process uptime at 99.999% and performance high, so we need to respond immediately to any error that stops a scenario. We identified two main problems that make it hard to keep uptime at that level.
This is the first part of the article, and it focuses on using the Make API to build a system that monitors the status of scenarios. Our system is called HealthCheck, and today it performs over 1.2 million checks each month.
Let’s dive into the problems first.

:smiling_imp: 2. The issues we faced when working with Make.com:

1. Make.com’s error alerts do not provide full control over the status of scenarios out of the box.
The standard tools Make.com offers for tracking errors are email alerts about stopped scenarios and internal alerts within the Make.com account. Before the Make API launched, we ran these emails through a parser and took our error notifications from it. That method was inconvenient and fragile: emails arrive constantly, the information in them is delayed, and parsing each email introduces its own errors and reduces stability. Once the Make API became available, we could finally read scenario status directly, without email. This completely changes the logic of working with errors and allows everything to be done natively.

2. Deleting scenarios or accounts due to human error.
The second problem is less noticeable at first glance but extremely critical: the deletion of scenarios, or of a client’s entire Make account.
It may seem funny, but several times we have seen employees on the client’s side delete the most important scenarios in the Make account, for one reason or another. It is even worse when the client deletes the entire Make account. Strange as it may seem, Make.com so far offers no way to recover a deleted scenario or to retrieve scenarios from a deleted account.

After such cases, the client is left with nothing: the scenarios cannot be restored, and the account is gone for good. This leads to colossal financial and time losses, as all the work spent developing and configuring the systems is destroyed in an instant!
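
Both problems come down to comparing two consecutive snapshots of an account’s scenarios: a scenario that was running and is now off, or one that has disappeared altogether. Below is a minimal Python sketch of that comparison. The `ScenarioStatus` type and its `is_active` flag are hypothetical stand-ins for whatever the Make API returns, not its actual response schema, and the example data is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioStatus:
    """Hypothetical snapshot of one scenario; not the Make API response schema."""
    scenario_id: int
    name: str
    is_active: bool  # assumed flag: True while the scenario is switched on


def find_newly_stopped(previous, current):
    """Scenarios that were active at the previous poll and are off now."""
    was_active = {s.scenario_id for s in previous if s.is_active}
    return [s for s in current if not s.is_active and s.scenario_id in was_active]


def find_deleted(previous, current):
    """Scenarios present at the previous poll that are missing now."""
    still_present = {s.scenario_id for s in current}
    return [s for s in previous if s.scenario_id not in still_present]


# Example data (invented for illustration).
previous_poll = [
    ScenarioStatus(1, "Sync orders", True),
    ScenarioStatus(2, "Send invoices", True),
    ScenarioStatus(3, "Update CRM", True),
]
current_poll = [
    ScenarioStatus(1, "Sync orders", True),
    ScenarioStatus(2, "Send invoices", False),  # stopped since the last poll
    # scenario 3 no longer appears: it may have been deleted
]

stopped = find_newly_stopped(previous_poll, current_poll)
deleted = find_deleted(previous_poll, current_poll)
print([s.name for s in stopped])  # ['Send invoices']
print([s.name for s in deleted])  # ['Update CRM']
```

Keeping the comparison separate from the API call means the same two functions can be run against any source of snapshots, including stored records from an earlier poll.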

:metal: 3. How the monitoring and backup system works.

Fortunately, because our system uses the Make API, we can recover a deleted scenario, or even an entire account, from a backup that is created daily. Let’s have a look at how it works.
First, here is the stack of tools the monitoring system runs on:

1. Make scenarios. We use several scenarios that call the Make API directly to poll the client’s scenarios, check their status, and save their blueprints. Connecting a client to the monitoring system is as simple as possible: we have a system account that needs to be added as an admin in the client’s Make.com account. After that, we get full access to all scenarios and start collecting, recording, and monitoring them. Additional scenarios power the Slack bot and manage its messages and the buttons in them.

2. Dashboards and databases in Airtable. Here we store every active scenario from the connected Make.com accounts. In Airtable, we store parameters such as:

  1. Scenario URL
  2. Scenario name
  3. Date of the last check
  4. Current status of the scenario
  5. Team member responsible for this scenario
  6. Link to the backup folder on Google Drive, where the latest version of the scenario is stored
  7. Client data, linking scenarios to our projects from CRM
  8. Necessary system IDs for Slack, Make, Google Drive

We also use Airtable for auto tests, backup and statistics. Learn more about managing Airtable in the second part of this article here: https://www.vatech.io/case/healthcheck-part-2-how-our-clients-invest-in-their-businesses-stability

3. Google Drive. This is where a folder is created for each client’s account. For every active scenario, a JSON blueprint file of the current version is saved. These files serve as the backup: if a scenario is deleted, we quickly restore the current version and put it back into operation.
4. Slack channel Errors. The monitoring system polls all scenarios every 5 minutes, and when a scenario’s status becomes Turned OFF, we receive a detailed message in the channel.

The error message contains:

  • Client and scenario name, link to the scenario
  • Name of the responsible person
  • Details of the last error and the time it occurred, for quick understanding of the problem
  • Buttons for quickly restarting the scenario. Often a rerun is all that is needed, and there is no need to open Make. We can also set up auto-restart of the scenario in the Airtable dashboard.
  • A dropdown menu to disable notifications for the current issue for 0.5 to 24 hours. This is useful when an error is caused by an external service and does not require our direct involvement, or when the scenario is turned off because it is currently being worked on.
  • A dropdown menu to assign the task to a team member.
  • "Fixed" and "Track 10 minutes" buttons, used to update the information about the scenario and to log a quick time entry for the fix.
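
To make the list above concrete, here is a rough Python sketch of such a message as a Slack Block Kit payload, plus the handling of the mute dropdown. The block and element types (`section`, `actions`, `button`, `static_select`, `users_select`) are standard Block Kit. The function names, the `action_id` values, the specific mute options, and the example data are our assumptions for illustration, not the exact message we send.

```python
from datetime import datetime, timedelta, timezone

MUTE_OPTIONS_HOURS = [0.5, 1, 4, 12, 24]  # assumed choices within the 0.5-24 h range


def build_error_message(client, scenario_name, scenario_url,
                        responsible, last_error, occurred_at):
    """Build a Slack Block Kit payload for a scenario that has been turned off."""
    def button(label, action_id):
        return {
            "type": "button",
            "action_id": action_id,  # hypothetical identifiers
            "text": {"type": "plain_text", "text": label},
            "value": scenario_url,
        }

    summary = (
        f"*{client}* | <{scenario_url}|{scenario_name}>\n"
        f"Responsible: {responsible}\n"
        f"Last error at {occurred_at}: {last_error}"
    )
    mute_menu = {
        "type": "static_select",
        "action_id": "mute_notifications",
        "placeholder": {"type": "plain_text", "text": "Mute notifications"},
        "options": [
            {"text": {"type": "plain_text", "text": f"{h} h"}, "value": str(h)}
            for h in MUTE_OPTIONS_HOURS
        ],
    }
    assignee_menu = {
        "type": "users_select",
        "action_id": "assign_member",
        "placeholder": {"type": "plain_text", "text": "Assign to"},
    }
    return {
        "text": f"{client}: scenario {scenario_name} is turned off",  # fallback text
        "blocks": [
            {"type": "section", "text": {"type": "mrkdwn", "text": summary}},
            {"type": "actions", "elements": [
                button("Restart", "restart_scenario"),
                mute_menu,
                assignee_menu,
                button("Fixed", "mark_fixed"),
                button("Track 10 minutes", "track_10_minutes"),
            ]},
        ],
    }


def mute_until(selected_value, now):
    """Turn the selected dropdown value (hours, as a string) into an expiry time."""
    return now + timedelta(hours=float(selected_value))


message = build_error_message(
    client="Acme", scenario_name="Sync orders",
    scenario_url="https://example.com/scenarios/1",
    responsible="Dana", last_error="HTTP 500 from CRM",
    occurred_at="2024-04-27 10:15 UTC",
)
expiry = mute_until("0.5", datetime(2024, 4, 27, 10, 15, tzinfo=timezone.utc))
print(len(message["blocks"][1]["elements"]))  # 5
print(expiry.isoformat())  # 2024-04-27T10:45:00+00:00
```

While the current time is before `expiry`, the monitor would skip re-posting the alert for that scenario.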

:+1: 4. The advantages of the monitoring and backup system:

For our clients, the value of the system lies in:

  • Little or no downtime of business processes. Clients no longer notice technical failures in scenarios. When they write to us “Look, something broke here”, we are already dealing with the problem, and most often it has already been solved. Clients see the email notification about an error or a stopped scenario, but by the time they reach out, the scenario is already running again. Clients can therefore rely on us and be confident that any error automatically triggers our response process and requires little or no involvement of their own resources.
  • Some clients like to figure things out themselves and to know what is happening with their system 24/7. For them, we duplicate the notifications in their own Slack. This gives them additional control and transparency over the operation of their business.
  • People sometimes make mistakes, and a curious employee with access to Make can delete a scenario or even the entire account! Our system has saved businesses in such cases several times. In one case, we quickly detected the accidental deletion of a complex, critical scenario that was the core of the client’s system, and restored it. A day of backend downtime would have cost the client $5,000-10,000, not including the cost of redeveloping the scenario. Restoring and setting up the backup took only 30 minutes instead of 2-3 days!
  • In short, our clients feel calm and secure. They are confident in the system’s stability and trust us.
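
The fast restore described above depends on a current blueprint copy existing for every scenario. Here is a minimal Python sketch of that save-and-load step, assuming one folder per client account and one JSON file per scenario. A temporary local directory stands in for Google Drive, and the function names, the file naming by scenario ID, and the example blueprint are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_blueprint(root, account, scenario_id, blueprint):
    """Write the blueprint as JSON; return True if the file was created or changed."""
    folder = Path(root) / account
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"{scenario_id}.json"  # assumed naming scheme
    payload = json.dumps(blueprint, indent=2, sort_keys=True)
    if target.exists() and target.read_text(encoding="utf-8") == payload:
        return False  # the backup is already current
    target.write_text(payload, encoding="utf-8")
    return True


def load_blueprint(root, account, scenario_id):
    """Read a saved blueprint back so it can be imported again."""
    target = Path(root) / account / f"{scenario_id}.json"
    return json.loads(target.read_text(encoding="utf-8"))


# A temporary local directory stands in for the Google Drive folder tree.
with tempfile.TemporaryDirectory() as root:
    # Invented blueprint content, for illustration only.
    blueprint = {"name": "Sync orders", "flow": [{"id": 1, "module": "example:step"}]}
    first = save_blueprint(root, "acme", 42, blueprint)
    second = save_blueprint(root, "acme", 42, blueprint)
    restored = load_blueprint(root, "acme", 42)

print(first, second)          # True False
print(restored == blueprint)  # True
```

Skipping the write when nothing has changed keeps a daily backup run cheap for scenarios that are rarely edited.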

For the development team, the benefits are as follows:

  • Immediate notification of problems lets us deal with important issues here and now. This eliminates chaos and allows an error to be corrected at an early stage. We do not need to sift through hundreds of logs looking for the cause. Every problem solved on the spot simplifies the team’s work, and because each scenario has a responsible person, the ticket goes to the scenario’s creator rather than to a random employee.
  • The system keeps several heavily loaded scenarios running even though they depend on unreliable APIs. For example, it automatically recovers from server errors on HubSpot’s side, which occur 3-5 times a day on average: the scenario resumes within 1 minute, saving 1 hour of developer work.
  • The ability to build more complex systems and create backup routes. If we are confident that our information is current, we can use that knowledge in future tasks. Development becomes simpler and more pleasant because developers know that the error system will back them up.
  • In the case of mass problems on Make’s side or in 3rd party services, we are the first to understand what happened because we see a similar pattern of errors across several accounts. This also helps us decide quickly what to do. For example, this happened when Monday.com modules were updated incorrectly and scenarios using Monday.com started producing errors en masse.
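
The cross-account pattern in the last point can be sketched as a simple grouping: collect the failing app for each stopped scenario and flag any app that is failing in several accounts at once. This Python sketch is an illustration under our own assumptions: the `find_mass_incidents` name, the `(account, app)` input shape, the threshold of 3 accounts, and the example data are all hypothetical.

```python
from collections import defaultdict


def find_mass_incidents(errors, min_accounts=3):
    """Map each app to the accounts it is failing in, keeping only widespread failures.

    errors: iterable of (account, app) pairs, one per stopped scenario.
    """
    accounts_by_app = defaultdict(set)
    for account, app in errors:
        accounts_by_app[app].add(account)
    return {
        app: sorted(accounts)
        for app, accounts in accounts_by_app.items()
        if len(accounts) >= min_accounts
    }


# Example data (invented for illustration).
recent_errors = [
    ("acme", "monday"), ("globex", "monday"), ("initech", "monday"),
    ("acme", "monday"),    # a second failing scenario in the same account
    ("globex", "hubspot"),
]
incidents = find_mass_incidents(recent_errors)
print(incidents)  # {'monday': ['acme', 'globex', 'initech']}
```

Counting distinct accounts rather than individual errors keeps one noisy account from looking like a platform-wide incident.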

:rocket: Conclusion

In conclusion, our system ensures the seamless continuity of your business processes with minimal to no downtime, offering peace of mind and operational security. Our proactive approach means that by the time you notice an issue, we’ve already addressed it, ensuring uninterrupted workflow.

Contact us to make your business run smoothly and securely.

Want to learn more about our security system and the money it saves? We have utilized Airtable Databases and Interfaces for autotests, infographics, and more. We've also improved the system, and thanks to filters, we spend less on operations: 1.2 million checks each month cost us just $35. How? Check out the second part of the article here: https://www.vatech.io/case/healthcheck-part-2-how-our-clients-invest-in-their-businesses-stability

