Incidents | Enterspeed. Incidents reported on the status page for Enterspeed https://www.enterspeedstatus.com/ Management App API recovered https://www.enterspeedstatus.com/ Sun, 01 Jun 2025 03:39:49 +0000 https://www.enterspeedstatus.com/#4b3eb5bb7343a5b935989da5fa77addf3fa87e7b083f1afefe966aa27930be80 Management App API recovered Management App API went down https://www.enterspeedstatus.com/ Sun, 01 Jun 2025 03:04:49 +0000 https://www.enterspeedstatus.com/#4b3eb5bb7343a5b935989da5fa77addf3fa87e7b083f1afefe966aa27930be80 Management App API went down Some customers may experience an error loading the Management App https://www.enterspeedstatus.com/incident/581703 Tue, 27 May 2025 14:04:00 -0000 https://www.enterspeedstatus.com/incident/581703#d077f058f153b06336a0ba20127b4c718c639c1179aa335b826a0e9a45bc712e The cause appears to have been an intermittent fault in a third-party component. The issue has been resolved, but may linger in some browsers. Please reach out on your Slack support channel or via support@enterspeed.com. Some customers may experience an error loading the Management App https://www.enterspeedstatus.com/incident/581703 Tue, 27 May 2025 13:51:00 -0000 https://www.enterspeedstatus.com/incident/581703#bcc3ac0eb425189588173f368f7682d99798a4daeaca890863efc31ee2c01ae1 We have identified an issue preventing some users from accessing the Management App. We are currently investigating and will report as soon as we know more.
Management App API recovered https://www.enterspeedstatus.com/ Tue, 13 May 2025 22:42:06 +0000 https://www.enterspeedstatus.com/#689dac61d7c3f628e4a9cae86300579c4fbb793b9b584f51a010f01e2aac0de4 Management App API recovered Management App API went down https://www.enterspeedstatus.com/ Tue, 13 May 2025 22:34:01 +0000 https://www.enterspeedstatus.com/#689dac61d7c3f628e4a9cae86300579c4fbb793b9b584f51a010f01e2aac0de4 Management App API went down Management App login is unavailable https://www.enterspeedstatus.com/incident/557219 Tue, 06 May 2025 12:27:00 -0000 https://www.enterspeedstatus.com/incident/557219#87eafd989505398f6f3d28cb2e368f497c8f82dbb53e234acb914d957da84320 Today's issue was the result of a configuration error in our deployment process that caused a secret to be set incorrectly. While we believed the issue had been fixed last week, a later update unintentionally reverted the change. Today a service was restarted after our system automatically recovered from a failed probe. This caused one of our services to fail to connect to the data store, as the secret was incorrectly configured. Resolution: We manually updated the configuration to restore services immediately after our monitoring reported problems. Steps to prevent similar failures in the future: * Improved how we manage sensitive configuration values in our infrastructure. * Added fallback behavior so the service remains operational even if a connection to the data store isn't possible. Please reach out if you have any questions via your support Slack channel or via support@enterspeed.com. Management App login is unavailable https://www.enterspeedstatus.com/incident/557219 Tue, 06 May 2025 06:28:00 -0000 https://www.enterspeedstatus.com/incident/557219#1a11554d909faccc2c3c54ccaae27e24db0f92e61d2046efe8d883274acfa822 The Management App is now fully available. We will update with more information as we learn more.
Management App login is unavailable https://www.enterspeedstatus.com/incident/557219 Tue, 06 May 2025 06:19:00 -0000 https://www.enterspeedstatus.com/incident/557219#6da2d65a5ec5fc1d5ad234f9a69d037156e0b4030350392470374863d0c4f20c We are seeing an issue with logging in on the Management App. We are currently investigating. Management App API recovered https://www.enterspeedstatus.com/ Wed, 30 Apr 2025 02:58:23 +0000 https://www.enterspeedstatus.com/#a3d46efc5504838fde7c529709172e4e237c1043fabfc2d4a7c4018f65eb99da Management App API recovered Management App API went down https://www.enterspeedstatus.com/ Wed, 30 Apr 2025 02:53:13 +0000 https://www.enterspeedstatus.com/#a3d46efc5504838fde7c529709172e4e237c1043fabfc2d4a7c4018f65eb99da Management App API went down Management App API recovered https://www.enterspeedstatus.com/ Fri, 25 Apr 2025 19:08:29 +0000 https://www.enterspeedstatus.com/#bb2b547ce505262db53784722bd45dd2fccefb92d87c5f297aaae54b28f210a7 Management App API recovered Management App API went down https://www.enterspeedstatus.com/ Fri, 25 Apr 2025 18:51:19 +0000 https://www.enterspeedstatus.com/#bb2b547ce505262db53784722bd45dd2fccefb92d87c5f297aaae54b28f210a7 Management App API went down Management App API recovered https://www.enterspeedstatus.com/ Fri, 25 Apr 2025 18:32:28 +0000 https://www.enterspeedstatus.com/#6a27f8786d465dc365b15c8611e096f9274b911ba9a897325578ff13cf085f48 Management App API recovered Management App API went down https://www.enterspeedstatus.com/ Fri, 25 Apr 2025 18:27:19 +0000 https://www.enterspeedstatus.com/#6a27f8786d465dc365b15c8611e096f9274b911ba9a897325578ff13cf085f48 Management App API went down Ingest API returns errors in some cases https://www.enterspeedstatus.com/incident/543102 Thu, 10 Apr 2025 12:20:00 -0000 https://www.enterspeedstatus.com/incident/543102#b5f44e4efbb8195ce2869342d35186e17fad4c465bd507690c5b954bb5190fa3 We have investigated the cause of the errors earlier today. 
The problems were caused by a new JSON serialiser that was part of today's deployment. As a side note, we process a lot of JSON and can gain a significant performance improvement with the new JSON serialiser, but we digress. We had not tested the new code in one specific use case, which resulted in 0.83% of Ingest API requests returning an error response even though the requests should have been accepted as valid. We will update our automated tests to include this scenario, so this will not happen again. We apologise for the inconvenience. If you have any questions, please reach out to support@enterspeed.com or via your Slack support channel. Ingest API returns errors in some cases https://www.enterspeedstatus.com/incident/543102 Thu, 10 Apr 2025 12:00:00 -0000 https://www.enterspeedstatus.com/incident/543102#4e5b37e7b39df198ed73c9359fb26cabbf3514242317ad8aec1d4ad7b5eae901 The rollback procedure was completed a few minutes ago, and we have verified that we are back to normal operations. Our initial investigation shows that the v1 endpoint was affected, while the v2 endpoint was not. We will get back to you with more details once we have investigated the scope and cause of the problem. Ingest API returns errors in some cases https://www.enterspeedstatus.com/incident/543102 Thu, 10 Apr 2025 11:50:00 -0000 https://www.enterspeedstatus.com/incident/543102#2b612d48c24bd543d9086213c0c3c18ea6ad8ef7950b6be5bbbae85f636b25d6 The errors appear to be related to a recent deployment and we have initiated the rollback procedure. We will report back when normal operation of the Ingest API is restored. Ingest API returns errors in some cases https://www.enterspeedstatus.com/incident/543102 Thu, 10 Apr 2025 11:42:00 -0000 https://www.enterspeedstatus.com/incident/543102#410fc16b76d85a589335231a85f6d2faf8519ff34465ce096aa948ea019df2df We have seen an increase in failed ingests on the Ingest API. We are currently investigating.
Ingest API not available https://www.enterspeedstatus.com/incident/533594 Mon, 24 Mar 2025 10:35:00 -0000 https://www.enterspeedstatus.com/incident/533594#d506d06f0a7dbdfa82fa71fde02cca646299fc20e945ec4f255f870dfa61961e The Ingest API has recovered. During a routine deployment, the Enterspeed Ingest API did not deploy correctly. The Ingest API service was then manually restarted and recovered after approximately five minutes of downtime. The error appeared without any warnings from deployments to development and testing environments. We will review the deployment process for any changes that can minimise the likelihood of a similar problem. If you have any questions, please reach out on your support Slack channel or via support@enterspeed.com. Ingest API not available https://www.enterspeedstatus.com/incident/533594 Mon, 24 Mar 2025 10:28:00 -0000 https://www.enterspeedstatus.com/incident/533594#c032bcc44b46d75b27ea50ef101286e3b33b7bb64434e364036463f6448be402 The Enterspeed Ingest API is currently unavailable. We are investigating. Destinations are currently experiencing issues https://www.enterspeedstatus.com/incident/429922 Mon, 16 Sep 2024 13:07:00 -0000 https://www.enterspeedstatus.com/incident/429922#9bab8b9cfb9b141eb64f7da139978eecc7b3de12925a8121aeeb2fd59031cd12 We have now successfully moved to a new region and the backlog of missing destination messages has been emptied. We are sorry for the inconvenience caused by this destination outage. We are awaiting an explanation from our cloud service provider and will, based on that, evaluate what can be done to prevent similar problems in the future. Destinations are currently experiencing issues https://www.enterspeedstatus.com/incident/429922 Mon, 16 Sep 2024 11:56:00 -0000 https://www.enterspeedstatus.com/incident/429922#fd3df2307d7805faaf9df4f8621163a89754b5205ee1fb2cf22079849a01e95d We are now in the final stages of testing our workaround for moving the faulty infrastructure to a new region.
Hopefully the next update will be a positive one. Also note that all missing events will be processed from the backlog, so all data will eventually be updated in the various destinations. Destinations are currently experiencing issues https://www.enterspeedstatus.com/incident/429922 Mon, 16 Sep 2024 10:26:00 -0000 https://www.enterspeedstatus.com/incident/429922#1f1874cd7f1553643cf184ff441b7792b1b901205e25421c71ccc38ada958abb We have confirmed that no Destinations are currently sending events. We have narrowed the problem down to an issue with a specific combination of .NET version, type of deployment and region. This is an issue with our cloud service provider and we are in the process of changing region to restore the destinations as soon as possible. We are also reaching out to the cloud service provider's support team. Destinations are currently experiencing issues https://www.enterspeedstatus.com/incident/429922 Mon, 16 Sep 2024 10:05:00 -0000 https://www.enterspeedstatus.com/incident/429922#df996645a69c91761c2bf3fe0cbb8ac2f8784e37d93b99ddc2e42ec7d5d386a9 Our monitoring has alerted us to errors in our Destinations feature. This prevents events from being sent to webhooks, Algolia, Relewise and our other Destinations. We are currently investigating and will post a new update when we have a better understanding of the issue. Ingest API suffers from very low volume of intermittent errors https://www.enterspeedstatus.com/incident/418658 Thu, 29 Aug 2024 06:56:00 -0000 https://www.enterspeedstatus.com/incident/418658#182d269257989ae3ed66451bf8466aa81ac6dcf71e333fae8a7d7b32c44092a1 After observing the metrics following the workaround implemented yesterday afternoon, we can now resolve the Ingest API degradation issue. We will update with a full post mortem when we have debriefed with our cloud service provider. Sorry for the inconvenience and please reach out if you have any questions.
Ingest API suffers from very low volume of intermittent errors https://www.enterspeedstatus.com/incident/418658 Wed, 28 Aug 2024 19:48:00 -0000 https://www.enterspeedstatus.com/incident/418658#0e89b1926fc83cd024c2aabdbe31490962510a03819f4699476d9d03ce5ee5cf Our monitoring systems show that we have successfully mitigated the issue. We will continue to monitor, engage with our cloud service provider's support team and investigate the cause of the issue. We will keep the issue open until we are certain that no further issues affect our customers. Ingest API suffers from very low volume of intermittent errors https://www.enterspeedstatus.com/incident/418658 Wed, 28 Aug 2024 16:10:00 -0000 https://www.enterspeedstatus.com/incident/418658#be24bbcc1b8e686ab3eaf43c49066211c2874f4b46867d2b944de6fb630be442 The first results indicate that our workaround is working and effectively eliminating the 502 errors. We will continue to monitor and evaluate the next steps. Ingest API suffers from very low volume of intermittent errors https://www.enterspeedstatus.com/incident/418658 Wed, 28 Aug 2024 13:39:00 -0000 https://www.enterspeedstatus.com/incident/418658#7d0bc97b044dae31f8252c91e83b8c4cd04dcabb62c46acbeb477635296610b9 Currently, we are working on two different strategies. First, we are trying a workaround using a different underlying resource type, and we are now monitoring its effects. Second, our cloud service provider has identified potential internal issues, which have now been escalated to their internal team. We will keep you updated as we learn more. Ingest API suffers from very low volume of intermittent errors https://www.enterspeedstatus.com/incident/418658 Wed, 28 Aug 2024 10:23:00 -0000 https://www.enterspeedstatus.com/incident/418658#9ebdc8bdbcf3ea5cb4a9d954541d94fc8d71aa3dea9d198324ae4a4c03d20af6 Since the last update we have continued to rule things out and monitor the effects.
We are working together with our cloud service provider's support team, who are actively investigating the cause of the issue. Ingest API suffers from very low volume of intermittent errors https://www.enterspeedstatus.com/incident/418658 Tue, 27 Aug 2024 10:13:00 -0000 https://www.enterspeedstatus.com/incident/418658#5001dfb9f97ae3c6d02ca347865950f51b0002bb87be918f96a221528ee36007 We continue to walk through the debugging and mitigation steps, but given the intermittent nature of this issue we cannot determine the effect of each step until a 2-12 hour period has passed. Please reach out on your Slack support channel or via support@enterspeed.com if you have any questions. Ingest API suffers from very low volume of intermittent errors https://www.enterspeedstatus.com/incident/418658 Mon, 26 Aug 2024 11:40:00 -0000 https://www.enterspeedstatus.com/incident/418658#5627e12f0f0350a3d3b85f5126dff241b1e8817b25197fdadeb82d0e31b0ab10 We are trying out various debugging and mitigation strategies. It is proving to be a slow process, as the issue occurs at random with up to 12 hours between occurrences. This of course has a big negative impact on the feedback cycle for our various initiatives. We are working together with our cloud service provider's support team to understand and mitigate the issue. Ingest API suffers from very low volume of intermittent errors https://www.enterspeedstatus.com/incident/418658 Sun, 25 Aug 2024 17:55:00 -0000 https://www.enterspeedstatus.com/incident/418658#73ff5e21f6a97b10dbaef9a5a8dd985c3e2009a25e3a207fb3bda146b9b72e7b We are continuing to monitor the situation and will update this page as we learn more.
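The first report for this incident (23 Aug 2024, further down this feed) recommends using a retry strategy when ingesting data, to protect against intermittent errors such as the 502 responses described here. The following is a minimal sketch of one common approach, exponential backoff with jitter. It is not Enterspeed's client code: `send`, the set of retried status codes and the default delays are all assumptions made for illustration.

```python
import random
import time
from typing import Callable

# Status codes treated as transient in this sketch. This set is an assumption;
# the status updates only mention 502 responses.
TRANSIENT_STATUS = {502, 503, 504}


def ingest_with_retry(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-transient status or attempts run out.

    send is a hypothetical callable that performs one ingest request and
    returns the HTTP status code of the response.
    """
    status = send()
    for attempt in range(1, max_attempts):
        if status not in TRANSIENT_STATUS:
            break
        # Exponential backoff with full jitter: the upper bound doubles on
        # every retry, and the actual wait is a random value below that bound.
        sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
        status = send()
    return status
```

In real use, `send` would wrap the HTTP call made by whatever client library you already use. Passing `sleep` as a parameter keeps the function easy to test without real waiting.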
Ingest API suffers from very low volume of intermittent errors https://www.enterspeedstatus.com/incident/418658 Sat, 24 Aug 2024 17:29:00 -0000 https://www.enterspeedstatus.com/incident/418658#51b4f189e1ec8b5d4d2b9375e2187b00be06be60f626fc4c592f11ea9ed1fe09 We are continuing to monitor the situation and the number of errors reported is still very low. We will update as we learn more. Ingest API suffers from very low volume of intermittent errors https://www.enterspeedstatus.com/incident/418658 Fri, 23 Aug 2024 00:36:00 -0000 https://www.enterspeedstatus.com/incident/418658#5fe33da3b3b65d50274d8753e2aace75bb98ad0d40a3365b0b5fc20642142d06 Our monitoring systems are reporting a very low number of Ingest API errors. We have many thousands of successful requests every hour, but an estimated 0.1% of Ingest API requests return a 502. Our current analysis points to an intermittent failure in our cloud partner's load balancer, and we are working together with their support to further diagnose the issue. As always, we recommend using a retry strategy when ingesting data into Enterspeed to protect against intermittent network issues. We apologise for any inconvenience and please reach out via your Slack support channel or via email if you have any questions. South East Asia Delivery Region unresponsive https://www.enterspeedstatus.com/incident/408731 Sat, 03 Aug 2024 06:29:00 -0000 https://www.enterspeedstatus.com/incident/408731#7b23f14afa6f9f70f8ffec1f09e95950445013a123ea85ac8585ec9675843a6f Recovered after Azure returned to normal operations. South East Asia Delivery Region unresponsive https://www.enterspeedstatus.com/incident/408731 Sat, 03 Aug 2024 06:13:00 -0000 https://www.enterspeedstatus.com/incident/408731#4f5637e0900031849857f3e62d871a0fce0537a3a25bad229fa9d4caca146640 The monitor for sea.delivery.enterspeed.com/health went down due to a timeout, receiving no headers. The incident affected monitoring from Asia, North America, and Europe.
After experiencing the timeouts, the monitors recovered automatically within a short period, and the incident was resolved. An Azure network issue can prevent login to Management App https://www.enterspeedstatus.com/incident/406223 Wed, 31 Jul 2024 06:25:00 -0000 https://www.enterspeedstatus.com/incident/406223#2d9e6dad51bb006b81b02597eab1646767ace93466e268b4a7861c8eb5ee4373 During the night, Microsoft declared the incident mitigated and there should be no further impact on Enterspeed services. If you experience any problems or have any questions, please write to us on your Slack support channel or via support@enterspeed.com. An Azure network issue can prevent login to Management App https://www.enterspeedstatus.com/incident/406223 Tue, 30 Jul 2024 16:10:00 -0000 https://www.enterspeedstatus.com/incident/406223#e04d5e12ef1d326caf42c81778acce5285b65b6ab973e1c3c504d8ce7e1e5bec Microsoft Azure is currently experiencing a networking issue that affects the login functionality of the Enterspeed Management App. We are monitoring the issue and will update this status page when Microsoft has resolved it. We recommend that our customers follow updates on: https://azure.status.microsoft/en-gb/status Scheduled service window with expected impact for Ingest API https://www.enterspeedstatus.com/incident/392841 Thu, 04 Jul 2024 06:44:20 -0000 https://www.enterspeedstatus.com/incident/392841#aaea3a7078b1e60e8e5952fdaabe5dbfcd4de8cd87014db5cb86e250912250ad Please be advised that there will be a scheduled service window today the 3rd of July from 23:00 to 24:00 CET to complete a failed deployment initiated yesterday. We apologise for the late notice, but this step is essential to progress our work. **Impact Details** * Ingest API and Management App (API): ~1 minute of expected downtime. * Delivery API: No impact anticipated. Thank you for your understanding and, as always, reach out on your Slack support channel or via support@enterspeed.com if you have any questions.
Scheduled service window with expected impact for Ingest API https://www.enterspeedstatus.com/incident/392841 Wed, 03 Jul 2024 22:00:00 +0000 https://www.enterspeedstatus.com/incident/392841#ebe559379dd916fc0c4ad01210ed5170c491b995d4d08a647e2587b5c3cfcd61 Maintenance completed Scheduled service window with expected impact for Ingest API https://www.enterspeedstatus.com/incident/392841 Wed, 03 Jul 2024 21:00:00 -0000 https://www.enterspeedstatus.com/incident/392841#aaea3a7078b1e60e8e5952fdaabe5dbfcd4de8cd87014db5cb86e250912250ad Please be advised that there will be a scheduled service window today the 3rd of July from 23:00 to 24:00 CET to complete a failed deployment initiated yesterday. We apologise for the late notice, but this step is essential to progress our work. **Impact Details** * Ingest API and Management App (API): ~1 minute of expected downtime. * Delivery API: No impact anticipated.
Thank you for your understanding and, as always, reach out on your Slack support channel or via support@enterspeed.com if you have any questions. Ingest API and Management App are experiencing problems https://www.enterspeedstatus.com/incident/392459 Tue, 02 Jul 2024 11:34:00 -0000 https://www.enterspeedstatus.com/incident/392459#5ba33b58537d60666efdee34ecdcc1fa286428cbb19ebc076fbf4932a9c9f576 All services are now back online and fully restored. The Delivery API was not affected by this incident. What we know so far is that a routine deployment proved not to be business as usual. Despite being peer reviewed and tested both manually and automatically in test environments, our upgrade of a key component broke when applied in production. We will further investigate why this was the case. We are very sorry for the inconvenience, and stand by to answer any questions either on your Slack support channel or via support@enterspeed.com.
Ingest API and Management App are experiencing problems https://www.enterspeedstatus.com/incident/392459 Tue, 02 Jul 2024 11:31:00 -0000 https://www.enterspeedstatus.com/incident/392459#dbec7c3521b23f20e3e99c15da77dadd75dbf03e62b0f33088086d53609673c7 Management App is back online at 13.29 CET. Ingest API and Management App are experiencing problems https://www.enterspeedstatus.com/incident/392459 Tue, 02 Jul 2024 11:24:00 -0000 https://www.enterspeedstatus.com/incident/392459#e05a0a77ccb0709e49a3bbd453399e8227b7458c7406fef53a28d4e08dfbf283 Ingest API is back online at 13.22 CET. Ingest API and Management App are experiencing problems https://www.enterspeedstatus.com/incident/392459 Tue, 02 Jul 2024 11:18:00 -0000 https://www.enterspeedstatus.com/incident/392459#e23b524530a8c5bef07daf47569960f049b03f402b478deb988196f192c53d47 Currently our Ingest API and Management App are reporting errors.
We will update you as we know more. Intermediate errors in processing https://www.enterspeedstatus.com/incident/366021 Wed, 08 May 2024 14:00:00 -0000 https://www.enterspeedstatus.com/incident/366021#f4b9f68018abd92d0e7b109b5fd262cfa9265c6a974b202e3bd1a5be1d905913 **Enterspeed Processing Post-Mortem: 7th May 2024** On the 7th of May 2024, an incident occurred that resulted in some views not being generated for users. This impacted their ability to access updates from their CMS or other sources on their websites. The incident stemmed from a bug introduced on the 6th of May during a change aimed at providing more insight into the jobs queue. Notably, Enterspeed's Ingest and Delivery API remained unaffected. If you have been affected by this bug, we recommend deploying all schemas. Re-ingesting the affected source entities will only have an effect if their data differs from the already ingested source entity. Additionally, feel free to reach out on your Slack support channel or via support@enterspeed.com for assistance. **What happened?** Around noon on the 7th of May, reports surfaced from users stating that some newly ingested source entities were not processed, leading to missing views. Although the first reports were vague, a clear pattern emerged from them. At 13:42 CET, we officially declared an incident on our status page. The engineering team promptly convened to identify and resolve the bug. Initial investigations, including log reviews and monitoring, showed no logged errors, and processing appeared functional.
Subsequent review of recent deployments revealed a change to our fairness queue, which triggered the observation that our deduplication feature was removing an excessive number of processing jobs. We then focused on this part of the system. Despite no glaring issues found in the code upon review, we were able to reproduce the bug around 15:30 CET, identifying a pattern where the issue occurred only during simultaneous processing of multiple environments. By 18:41 CET, a fix was deployed to production, restoring normalcy to the deduplication monitoring graph. **Root Cause** For an incident like this to happen, multiple steps need to fail, including peer reviews, manual testing, and automated checks. Neither the engineer nor the code reviewer fully grasped the consequences of the code change. Despite relying on automated testing, the scenario leading to the bug was not covered. The root cause was attributed to human error in misunderstanding how the deduplication feature functions. **Lessons Learned** Key takeaways include the need to enhance automated testing to cover such scenarios and consider expanding logging for better error identification. Plans are in place to add more environment-based rules to the fairness queue, incorporating these lessons into the platform moving forward. In the short term, we will expand our automated tests to cover similar scenarios. **Final Words** We recognise the severity of this incident and acknowledge its impact on user trust. We apologise for any inconvenience caused and are committed to avoiding similar issues in the future. For any questions or concerns, please reach out on your Slack support channel or via support@enterspeed.com. You can learn more about the [technical details of our fairness queue in our blog post](https://www.enterspeed.com/blog/creating-fairness-in-a-multi-tenant-setup).
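The post-mortem above does not publish the faulty code, so the following is only a hypothetical illustration of one way a deduplication step can discard too many jobs when several environments are processed at once: a deduplication key that is too coarse. The `Job` fields, the `dedupe` helper and the keys are all invented for this sketch and are not taken from Enterspeed's implementation.

```python
from typing import Callable, Hashable, List, NamedTuple


class Job(NamedTuple):
    # Hypothetical job shape; the field names are assumptions.
    environment: str
    entity_id: str


def dedupe(jobs: List[Job], key: Callable[[Job], Hashable]) -> List[Job]:
    """Keep the first job seen for each key, preserving queue order."""
    seen = set()
    kept = []
    for job in jobs:
        k = key(job)
        if k not in seen:
            seen.add(k)
            kept.append(job)
    return kept


queue = [Job("dev", "page-1"), Job("prod", "page-1"), Job("prod", "page-1")]

# Key ignores the environment: the prod job is wrongly treated as a
# duplicate of the dev job and discarded.
too_coarse = dedupe(queue, key=lambda j: j.entity_id)

# Key includes the environment: only the genuine duplicate is discarded.
per_environment = dedupe(queue, key=lambda j: (j.environment, j.entity_id))
```

A bug of this shape stays invisible while only one environment has jobs queued, which is consistent with the pattern described above, and it produces no errors in the logs because discarding a duplicate is normal behaviour. An automated test that queues the same entity in two environments at once would catch it.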
Intermediate errors in processing https://www.enterspeedstatus.com/incident/366021 Tue, 07 May 2024 18:14:00 -0000 https://www.enterspeedstatus.com/incident/366021#cae443f89300715533d1252302da8733d503e602261a65626631c9c9680ca920 The fix continues to show the expected results. We will monitor the platform and update this page if anything changes. Please reach out on your Slack support channel or via support@enterspeed.com if you have any questions. Intermediate errors in processing https://www.enterspeedstatus.com/incident/366021 Tue, 07 May 2024 16:48:00 -0000 https://www.enterspeedstatus.com/incident/366021#a98366d4251c958f1443554cd22d50f637aac69cf39a37b905887c327b8a95c0 We have deployed a fix and will continue to monitor the platform to confirm that the fix has the desired outcome. Intermediate errors in processing https://www.enterspeedstatus.com/incident/366021 Tue, 07 May 2024 13:49:00 -0000 https://www.enterspeedstatus.com/incident/366021#0ab747aa445158c809b2b51cebbf2787785b7d726be451469de35d65c59f1c16 We have identified the source of the problem and we are now working to restore 100% processing. We have found an issue that occurs when multiple environments have jobs in the processing queue at the same time, so avoiding that situation may help if you are experiencing any issues at the moment. Next update at 19.00 CET at the latest. Intermediate errors in processing https://www.enterspeedstatus.com/incident/366021 Tue, 07 May 2024 12:59:00 -0000 https://www.enterspeedstatus.com/incident/366021#160628d3ce53198e6dee2d3d6eff51cac3bba9901d27a73c1562f4040c973678 We are still investigating the root cause of the issue. We have identified that a limited number of view processing jobs are being wrongly discarded.
Intermediate errors in processing https://www.enterspeedstatus.com/incident/366021 Tue, 07 May 2024 11:42:00 -0000 https://www.enterspeedstatus.com/incident/366021#c3ec41363528fcc419da092116e0f05362e0b1f152ebe5c778764eb6902e00ea We are currently investigating reports of missing view generations in the processing layer. We will update this page as we know more. Unresponsive South East Asia Delivery Region https://www.enterspeedstatus.com/incident/349438 Sun, 31 Mar 2024 15:12:00 -0000 https://www.enterspeedstatus.com/incident/349438#fbee057903a21219662a5f20fcb11a613b35f2a54e66078e33256e2dc536640f The South East Asia Delivery region self recovered after approximately 8 minutes. Unresponsive South East Asia Delivery Region https://www.enterspeedstatus.com/incident/349438 Sun, 31 Mar 2024 14:56:00 -0000 https://www.enterspeedstatus.com/incident/349438#5779082926c5a86664260afed6f30ee01dc1f695b4131296c27909b8f7b40ead Our monitoring detected an unresponsive Delivery API region in South East Asia. Unresponsive South East Asia Delivery API region https://www.enterspeedstatus.com/incident/342288 Mon, 18 Mar 2024 04:05:00 -0000 https://www.enterspeedstatus.com/incident/342288#39df5b6126d675c44b92fb5962adf7487189467a9e52d24e46ef7844ae632a72 The South East Asia Delivery region self recovered after approximately 8 minutes. Unresponsive South East Asia Delivery API region https://www.enterspeedstatus.com/incident/342288 Mon, 18 Mar 2024 03:57:00 -0000 https://www.enterspeedstatus.com/incident/342288#4e01a90f1961b9d341ffe43dd405fd2db2c9f199b21f94d3d61ca42865aebf22 Our monitoring detected an unresponsive Delivery API region in South East Asia. South East Asia Delivery API region not available https://www.enterspeedstatus.com/incident/340193 Wed, 13 Mar 2024 01:10:00 -0000 https://www.enterspeedstatus.com/incident/340193#dc690e70557d07e0643d9504f9012dafbe43d8376d5ed7d18e58d0c2d0680697 The South East Asia Delivery region self recovered after approximately 5 minutes. 
South East Asia Delivery API region not available https://www.enterspeedstatus.com/incident/340193 Wed, 13 Mar 2024 01:04:00 -0000 https://www.enterspeedstatus.com/incident/340193#b793518d7cac27dd8f0d1f8a3687f975fd0f34826c1e6cf07dfd53f56ccd8887 Our monitoring detected an unresponsive Delivery API region in South East Asia. Destination events not send https://www.enterspeedstatus.com/incident/336452 Tue, 05 Mar 2024 07:38:00 -0000 https://www.enterspeedstatus.com/incident/336452#9963ce571ce97a74c6bcb20bfaa896f0ea2f30e873eea7ff6bb880372d0ff989 Our engineering team got to work on the issue first thing in the morning and resolved it at 08:38 AM CET. The approximately 400 affected events were then processed from the queue, and all events were sent to the various destinations. Our investigation shows that our cloud provider is performing scheduled maintenance, and we believe the outage was caused by this scheduled maintenance. See more details: https://app.azure.com/h/QML1-FCG/a45f7f We apologise for the inconvenience and encourage you to reach out on your Slack support channel or via support@enterspeed.com if you have any questions. Destination events not send https://www.enterspeedstatus.com/incident/336452 Tue, 05 Mar 2024 02:01:00 -0000 https://www.enterspeedstatus.com/incident/336452#23846940d6745408d45ef46cc57d11971f520a88360a15886442ae6914724191 At 03.01 AM CET our internal monitoring reported that destination events were not being sent. This affected all destinations, such as Webhooks and Algolia. Ingest API not available https://www.enterspeedstatus.com/incident/336437 Tue, 05 Mar 2024 01:03:00 -0000 https://www.enterspeedstatus.com/incident/336437#522de120ffcded58df0fc379c1eb0323ba70d3a5d91fcd16c01ddb6fd7a4e5e6 At 02:03 AM CET the Ingest API became available again. Our investigation shows that our cloud provider is performing scheduled maintenance, and we believe the outage was caused by this scheduled maintenance. 
See more details: https://app.azure.com/h/QML1-FCG/a45f7f We apologise for the inconvenience and encourage you to reach out on your Slack support channel or via support@enterspeed.com if you have any questions. Ingest API not available https://www.enterspeedstatus.com/incident/336437 Tue, 05 Mar 2024 00:52:00 -0000 https://www.enterspeedstatus.com/incident/336437#407ce08138a09df1af5c50a06ee4f4c3c1c1d526f5aa1e0992c91bd52ddbe46d At 01:52 AM CET, our internal monitoring reported that our Ingest API was unavailable. Degraded performance on delivery.enterspeed.com https://www.enterspeedstatus.com/incident/220932 Fri, 16 Jun 2023 07:06:00 -0000 https://www.enterspeedstatus.com/incident/220932#0801ff7d58b7a4d78017b1585b9728802f8904cdf2cb08a4f7d3a554aabe270a Our internal monitoring has shown that our performance is back to normal as of 8.28 CET. Microsoft is also reporting that customers should begin to see things come back to normal, although they have not officially resolved the issue on their end. But from your and Enterspeed's perspective, we are back in business. As always, reach out if you have any questions. Degraded performance on delivery.enterspeed.com https://www.enterspeedstatus.com/incident/220932 Fri, 16 Jun 2023 02:31:00 -0000 https://www.enterspeedstatus.com/incident/220932#b0decd182bd00d7fbc5ad8125ca356122cf7a2ca8c44a388982b514d7cce904d We are currently experiencing degraded performance from our cloud provider. Since 4.31 CET we have seen prolonged response times from our Delivery API in our West Europe delivery region. About 0.5 % of Delivery API requests have also failed entirely between 4.31 and 8.10 CET. Microsoft is working on mitigating the issue. We will update you when we know more from Microsoft. 
You can follow along on Azure's status page: https://azure.status.microsoft/en-us/status It may be an option for you to use our South East Asia Delivery Region: sea.delivery.enterspeed.com Please contact us at support@enterspeed.com or via your Slack support channel if you have any questions. South East Asia Delivery API Region is unavailable https://www.enterspeedstatus.com/incident/173271 Thu, 09 Feb 2023 08:01:00 -0000 https://www.enterspeedstatus.com/incident/173271#8aad1aed4b247829ec07e56e309e0a5e5a71e858df782141242bc6489e1abb47 We have now been able to restart the South East Asia Delivery region after the Azure incident that affected this region. During the incident a queue of views for the SEA region built up; this queue should be cleared in a few minutes. Please reach out to support@enterspeed.com if you have any questions. South East Asia Delivery API Region is unavailable https://www.enterspeedstatus.com/incident/173271 Thu, 09 Feb 2023 05:18:00 -0000 https://www.enterspeedstatus.com/incident/173271#0bc9ede79d6a315d094d824fdb37ec28d9047ac27dcf833bc87e04f16c81f168 The South East Asia Delivery Region is currently unavailable. Microsoft Azure is currently experiencing an issue that is affecting our SEA Delivery API. Any customers using the sea.delivery.enterspeed.com endpoint can use the delivery.enterspeed.com endpoint while this incident is ongoing. We will update as we know more. Intermittent issue with cloud provider https://www.enterspeedstatus.com/incident/168073 Wed, 25 Jan 2023 11:02:00 -0000 https://www.enterspeedstatus.com/incident/168073#66859cbd88a519029831a51b87a77d760bd81e0027dd91e3fb17c26dc41a4da5 Microsoft has now officially resolved the incident. 
Intermittent issue with cloud provider https://www.enterspeedstatus.com/incident/168073 Wed, 25 Jan 2023 10:28:00 -0000 https://www.enterspeedstatus.com/incident/168073#eb20fa4d09a864e795f546be6a86e974d0f13e0caf4ba52ab65aff2530c3e3df Microsoft has provided an update with their latest status: "We have identified a recent change to WAN as the underlying cause, and have taken steps to roll back this change. Our telemetry shows consistent signs of recovery from 09:00 UTC onwards across multiple regions and services, and we are continuing to actively monitor the situation." Intermittent issue with cloud provider https://www.enterspeedstatus.com/incident/168073 Wed, 25 Jan 2023 08:58:00 -0000 https://www.enterspeedstatus.com/incident/168073#10d097eeb1c7d0a70b0be408d5edf5742caca2b1051e6a80678e4b0f35674aee Microsoft has acknowledged a global networking issue from 08:05 CET. As of now all of our services are functioning. We are monitoring the situation. 
Intermittent issue with cloud provider https://www.enterspeedstatus.com/incident/168073 Wed, 25 Jan 2023 08:27:00 -0000 https://www.enterspeedstatus.com/incident/168073#7c4bce0a05fdafff8e25433cc47f7b086655d402d0f93e53f2516a0d3ebe4da5 Our monitoring is picking up intermittent issues with both Delivery API and Ingest API. This appears to be related to our cloud provider Microsoft Azure investigating issues on their platform. Our team is monitoring the situation. Delivery API data is delayed https://www.enterspeedstatus.com/incident/150262 Thu, 05 Jan 2023 20:31:00 -0000 https://www.enterspeedstatus.com/incident/150262#9640b3bc0a933319fc38058e8baeee783c267afc32122823bcc4718de8f74cd0 We want to follow up on Tuesday's incident. Our investigations have shown that multiple events were the cause of the incident. The first event was a series of schema deploys on a large tenant. When working with large data sets it is not uncommon for a schema deployment to take a few minutes, and in some cases, 10-20 minutes is to be expected. In this particular case, a total of 15 schemas were deployed, and each needed a few minutes. At some point, the developer didn't see the expected views in the Delivery API and decided to re-deploy some of the schemas. Normally this is not a cause for concern, as the processing layer simply works through the queue of jobs. 
But one of the schemas with multiple deploys triggered the processing of a particular source entity type with a high number of large source entities. This number of entities combined with the size of each entity triggered a bug in our code. And with the added impact of multiple schema deploys, the bug caused our processing layer to lock up and essentially stop processing. We reacted by deploying a new version of the platform that had a "kill switch" feature that can clear out messages for a particular tenant. This had an immediate effect and stopped the processing queue from growing. Unfortunately, the new version had a performance regression bug that significantly reduced the processing capabilities. We then deployed yet another version of the platform with the "kill switch", but without the performance regression bug. Then we were quickly able to clear out the queue. We want to underline that the incident is not the fault of the developer triggering multiple deploys; it might have contributed to the size of the problem, but it was not the root cause of the problem. We still have some unanswered questions, especially why this bug only now caused the processing layer to become backlogged. Both the buggy code and the schema version with the specific source entity type have, so far as our investigation has shown, been in the platform for multiple months without triggering the processing layer to become backlogged. Our main theory is that other optimisations have generally allowed the processing layer to process more, faster, thereby increasing the overall load. But we have chosen not to investigate further. We have learned enough to start working on initiatives to make sure that a similar incident can't happen again. This incident shares some of the characteristics of the previous incident in December. It is beyond frustrating to again experience that the processing layer is backlogged and that it takes hours to clear the queue. 
The cause of the problem was different, but the symptoms and effects for our users were the same. We are glad to see that one of the actions from the last incident proved to help mitigate the situation, but we are not satisfied with the time it took to deploy a functioning version. The first action is of course to fix the bug that was triggered in this incident. Secondly, we continue to work on mitigating actions around the isolation of a tenant's processing queues, detection of potential cascading events and detection of potential "over processing". Furthermore, we are also working on providing more feedback for the developers using Enterspeed, so they can better understand the processing timings. The performance regression bug introduced in the first version of the kill switch also bears mentioning. We deployed without performance tests having been done. We had done our code reviews, and automated and manual tests, so our confidence was high. But despite our confidence in the code, a performance regression bug was deployed. This leads us to improve our load performance test setup. With the experience of this incident, we can conclude that setting up these tests requires too much manual work, and therefore they are done only when we work directly with performance-critical areas. This leaves too much room for regression bugs in seemingly unrelated areas. Delivery API data is delayed https://www.enterspeedstatus.com/incident/150262 Tue, 03 Jan 2023 14:34:00 -0000 https://www.enterspeedstatus.com/incident/150262#ad83e1a7198f7c164014fc4c8e5a24c4b177880d497a400b1c782c79ec328d20 The queue is now completely processed and we are back to normal processing times. We are very sorry for the delays. We are still working through the data to understand what triggered this. We will post a full incident report when we know more. 
Delivery API data is delayed https://www.enterspeedstatus.com/incident/150262 Tue, 03 Jan 2023 14:25:00 -0000 https://www.enterspeedstatus.com/incident/150262#f9d7a2a322fd86f9267199d7c7ba3b706c97b6217c909ff2b7dcb4ed340fbafb We have now deployed a fix to the kill switch and are now seeing the expected performance. We expect the queue to be processed in ~10 minutes (15.35 CET). We will update again when the processing is done. Delivery API data is delayed https://www.enterspeedstatus.com/incident/150262 Tue, 03 Jan 2023 12:44:00 -0000 https://www.enterspeedstatus.com/incident/150262#f3f40cf2575fed3b3c8d0ee1e408af7a75ea57cccf9f5776e7a7829594eb2581 As previously reported we have triggered our tenant "kill switch" on the specific tenant where this issue is occurring. The newly introduced kill switch is not performing the purging as fast as we expected. Our current estimate for completion is sometime between 19.00 and 20.00 CET. We are currently working on significantly improving the performance of the kill switch. Please remember that we are prioritising data consistency for all other tenants; we would rather live with some delays than risk data inconsistencies. Delivery API data is delayed https://www.enterspeedstatus.com/incident/150262 Tue, 03 Jan 2023 10:26:00 -0000 https://www.enterspeedstatus.com/incident/150262#0de4b7dcdede224492126a141c67d59c551a721879705ea8238a22f85c731f4f We have identified the issue and isolated it to a single tenant. We are now working to remove the backlogged messages for this specific tenant. Delivery API data is delayed https://www.enterspeedstatus.com/incident/150262 Tue, 03 Jan 2023 08:34:00 -0000 https://www.enterspeedstatus.com/incident/150262#f796b8270053ad1df2297e5f7646a6502ac772d9020f80276655f52b88aa8478 Our monitoring has picked up on delays in our processing layer. We are currently investigating the issue. 
Delivery API data is delayed https://www.enterspeedstatus.com/incident/145044 Thu, 15 Dec 2022 09:12:00 -0000 https://www.enterspeedstatus.com/incident/145044#2935cb733d0225ace855208f2ff4ce64a6ada4b3753fb6608e18fe453ca6ee5b We want to follow up on yesterday's incident. This was unlike anything we have ever seen before. The trigger of the incident was a customer performing a re-seed on a medium to large-sized tenant. One of the schemas in particular was crafted very unfortunately. Almost all source entities would trigger the schema, and because the schema had an action to trigger its parent entity, that cascaded into millions of view generations in just 2 hours. A lot of the view generation was unnecessary in this scenario, but due to the distributed and horizontal scalability of our processing layer, we can't easily detect when a view will be superseded by a newer view just seconds later. We have accepted a level of "over processing", as we have regarded this as the most effective strategy. During the day we tweaked our processing capabilities to more than 3 times the speed of Tuesday. It was not possible to purge the queue without compromising the data consistency. We had to take the decision that provided the safest possible way back to normal. This incident, unfortunately, caused a delay for all of our customers yesterday. This delay is unsatisfactory for us and for the service we want to deliver to our customers. We have therefore planned the following mitigating actions: * Improved monitoring and alarming around processing queues and processing time * Isolation of tenant's processing queues * Detection of potential cascading events * Detection of potential "over processing" If you have any questions, then please reach out to support@enterspeed.com. 
Emil Rasmussen, CTO Delivery API data is delayed https://www.enterspeedstatus.com/incident/145044 Wed, 14 Dec 2022 19:39:00 -0000 https://www.enterspeedstatus.com/incident/145044#17a198850085ad512e8df31d2efa7ce26800376d08710ffea3b5f0ab7a3084ac The entire queue was processed at 19.31 CET and we are not experiencing any delays in our processing layer. We are very sorry for the delays. We have multiple short- and long-term changes ahead, so that a similar situation will not happen again. Please reach out at support@enterspeed.com if you have any questions. Emil Rasmussen, CTO Delivery API data is delayed https://www.enterspeedstatus.com/incident/145044 Wed, 14 Dec 2022 18:16:00 -0000 https://www.enterspeedstatus.com/incident/145044#d72eb2f4d325fd55948deea406efce62c840b7551d94ba1c203ca62358930c09 We are on track for our previously estimated end time of 20:00 CET. The latest update indicates that we will be done slightly earlier. We will update again at 20:45 CET. Delivery API data is delayed https://www.enterspeedstatus.com/incident/145044 Wed, 14 Dec 2022 16:01:00 -0000 https://www.enterspeedstatus.com/incident/145044#b97a8de6ab3c074fd292afbda4a981a599dd127ae8227f9f9d633d380a952eed We are on track for our previously estimated end time of 20:00 CET. We will update again at 19:30 CET. Delivery API data is delayed https://www.enterspeedstatus.com/incident/145044 Wed, 14 Dec 2022 14:25:00 -0000 https://www.enterspeedstatus.com/incident/145044#ecebd0d53fe9e68a6191f0456cb9ba95c33f1559c6820ba6d9853da59b867bac We are on track for our previously estimated end time of 20:00 CET. We are very sorry for the delays. We have identified the root cause, and will work towards eliminating this in the future. We will update again at 17:00 CET. 
Delivery API data is delayed https://www.enterspeedstatus.com/incident/145044 Wed, 14 Dec 2022 13:30:00 -0000 https://www.enterspeedstatus.com/incident/145044#bc1324d7a4d900002926804a7b4bf53e7075b7d610042a0b8730e651ee636ce3 We are on track for our previously estimated end time of 20:00 CET. We will update again at 15:30 CET. If you have any questions please reach out to us at support@enterspeed.com Delivery API data is delayed https://www.enterspeedstatus.com/incident/145044 Wed, 14 Dec 2022 11:59:00 -0000 https://www.enterspeedstatus.com/incident/145044#70bad75ba264a6b35dc2750df7d79052b87504fc2621e158e80befa29d4c9fe9 Our current estimate is that the processing queue will be done at 20.00 CET. We will report with a new update at 14.30 CET. Delivery API data is delayed https://www.enterspeedstatus.com/incident/145044 Wed, 14 Dec 2022 11:19:00 -0000 https://www.enterspeedstatus.com/incident/145044#eca2c17ce80cf1b81144f3c8a06d2d213198ff9a08ffc7065d2a155a74ea3db7 We have identified a large number of processing jobs in our queue. Since 03.07 (AM) CET our processing layer has been working through ~8 million events per hour. (We previously reported an incorrect estimate of when all data will be processed. We are currently increasing the processing capabilities and a new estimate will be reported in the next update.) We will continue monitoring and report a new status at 13:15 CET. We are sorry for the delays. Delivery API data is delayed https://www.enterspeedstatus.com/incident/145044 Wed, 14 Dec 2022 11:08:00 -0000 https://www.enterspeedstatus.com/incident/145044#9c8257d3b3aac03d43f982a0d06f6e8ffe0609ea2c7dd0f23e16afcc11d28383 The Delivery API data (views processing) is currently experiencing delays. We are currently investigating a delay in our processing layer. 
We will update when we have more info (no later than 12.45) Certain schemas doesn't create the expected output https://www.enterspeedstatus.com/incident/138839 Fri, 25 Nov 2022 14:49:00 -0000 https://www.enterspeedstatus.com/incident/138839#1e7dc41d53aea13f0ee2f010226ee22b03c40d5e6ae15d8f9d001ca289b221ee After yesterday's incident, we have worked to understand the cause and scope of the issue. The issue affected any tenants that utilised the partial schema feature and updated content in a roughly 90-minute period. The exact issues depended on the specific content and configuration of the customer solution. Internally we categorise this incident as level 2 "High", and when these incidents happen we are not satisfied. This type of issue should never happen, but when it does we want to understand both what happened and where our tools or processes failed. In this incident, two things happened. Firstly, the actual issue causing the service to fail was a simple mistake in the deployed code. Our process with peer review of all code changes resulted in a last-minute change that was the cause of the failure. The mistake was a comparison logic mistake that was not detected upon the subsequent review and approval. This first issue is a simple mistake of the kind that happens every day for software engineers. This very rarely slips through to end users, as we have both manual and automated testing procedures. Due to the simplicity of the change, the engineers skipped the manual test for the last iteration of the review process. This is allowed by our process, as we have multiple levels of automated tests. And now to the second issue causing this failure. Secondly, the automated test that gives us the confidence to release often and fast did not function as intended. In what we now see as very unfortunate timing, two days before the failed deploy we changed our test suite. This resulted in the tests reporting green, while the actual test performed failed in the background. 
We, therefore, relied on a faulty test. We also want to note that the test suite goes through the same peer review process. This is a bit of speculation, but as the test suite was new, our collective experience is not as high as with most of our other code bases. Whatever the cause, the wrongly configured test suite was the single most important factor in this incident. We work every day to keep Enterspeed running smoothly, and yesterday we failed. We will continue to evaluate our process and improve where needed. We are very sorry for the inconvenience. If you have further questions or comments, you are always welcome to reach out to support@enterspeed.com Emil Rasmussen, CTO Certain schemas doesn't create the expected output https://www.enterspeedstatus.com/incident/138839 Thu, 24 Nov 2022 14:57:00 -0000 https://www.enterspeedstatus.com/incident/138839#3f026b5d776840ac669f07e0d5d92660a45f9a846dc974a74ad16c7433a5b6cb After we received reports of failing schemas we initiated the rollback procedure. The rollback was complete at 15:57 CET. The error was related to a planned deployment; we are investigating the exact cause of the error and under what circumstances it manifested itself. We are very sorry for the inconvenience. Don't hesitate to reach out on your normal Slack support channel or via support@enterspeed.com if you need any information or help with your specific tenant. We will follow up with a more detailed report and what we are doing to avoid this in the future. Emil Rasmussen, CTO Certain schemas doesn't create the expected output https://www.enterspeedstatus.com/incident/138839 Thu, 24 Nov 2022 13:29:00 -0000 https://www.enterspeedstatus.com/incident/138839#4981d15585e993acfcbccf063a09dbde5fbe792271b27a864a6699b28e29ae8e Following a planned deployment, we received reports that certain schemas don't create the expected output. 
Some Delivery API requests falsely returned a 404 error https://www.enterspeedstatus.com/incident/126115 Thu, 13 Oct 2022 11:46:00 -0000 https://www.enterspeedstatus.com/incident/126115#df24a9d645396e559975efe2e2abe9d3745f062da5e7c8e8b42e13eaab512de7 We are very sorry for the inconvenience caused by this incident. We understand that our Delivery API is the backbone of your website and any full or partial outages are not up to the standard we hold ourselves to. We are still working through the details of what exactly happened, but we want to let you know what we know so far. Two aspects of our process failed in this situation. The first issue is of course that we deployed a release with a serious defect. The problem was not discovered during our normal review and testing process. All changes are peer-reviewed and go through multiple layers of automated testing. While we are not yet fully aware of the scale of the issue, it is clear that a subset of Delivery API requests was affected by this issue. The issue went undetected in our automated testing because the pattern of the affected URLs was not included in our testing. The fact that this incident only affected a subset of URLs caused our automated testing to falsely approve the deployment. We are yet to identify the scale and exact pattern that was affected. --- UPDATE 14th of October 2022 Upon further investigation we have concluded that URLs matching either / or https://www.eaxmple.net/ were affected; all other URLs (e.g. /about, https://www.eaxmple.net/about) were not affected by this incident. --- Secondly, our internal monitoring didn't pick up on the problem and notify us of the issue. We suspect that the root cause of the issue is a bug in how we validate URLs. The problem manifested itself as returning the wrong "meta HTTP status code", but from our internal monitoring system's perspective, all Delivery API requests were successfully delivered (HTTP status code 200). 
But from the requesting frontend's perspective, they received a 404. So from our perspective, everything seemed to be working as intended, but the clients didn't get the data they expected. We will work on improving both our automated tests and internal monitoring to make sure a similar problem can't be released at all and, furthermore, that we will get alerted in case of a similar issue. We are very sorry for the inconvenience caused by this incident and if you have any questions please reach out on support@enterspeed.com Emil Rasmussen CTO Some Delivery API requests falsely returned a 404 error https://www.enterspeedstatus.com/incident/126115 Thu, 13 Oct 2022 11:03:00 -0000 https://www.enterspeedstatus.com/incident/126115#2d73acca99a0522453a32bf6866e6daf5202dacdf4bfe2c69614affc8c315cb4 The rollback was successful and all Delivery API requests now return the correct meta status code. Some Delivery API requests falsely returned a 404 error https://www.enterspeedstatus.com/incident/126115 Thu, 13 Oct 2022 10:35:00 -0000 https://www.enterspeedstatus.com/incident/126115#27558f9ef6179881d347446248249ce6255b77598b9988222285e70dd82f15bf Around 12.45 PM CET we were notified that some Delivery API requests returned a 404 error. The error was related to a deployment that was done at 12.35 PM CET. Within minutes of being notified we rolled back, and service was fully restored at 13.03 CET. Reports of schemas not being saved https://www.enterspeedstatus.com/incident/109004 Fri, 12 Aug 2022 06:53:00 -0000 https://www.enterspeedstatus.com/incident/109004#eb662bb1a1633384b2008cc98f6dcfe25a6af1cc4201009e2f16c7d6d54797b6 The full service has now been restored. We have performed a rollback to a previous version while we investigate the root cause of the issue. We are sorry for the inconvenience. If you have any questions or concerns please reach out to support@enterspeed.com. 
Reports of schemas not being saved https://www.enterspeedstatus.com/incident/109004 Fri, 12 Aug 2022 06:32:00 -0000 https://www.enterspeedstatus.com/incident/109004#90013ec80c484371e92640ee0d5c8746ae42906be4af6a988d3e99cd613315a6 We have been notified about a problem with saving schemas. Our team is currently investigating. Infrastructure maintenance https://www.enterspeedstatus.com/incident/101758 Fri, 15 Jul 2022 04:52:28 -0000 https://www.enterspeedstatus.com/incident/101758#253aa1fa7678662f4d5cf7e434b51db996c754feadd2973549c385dc1ee83780 We are doing some behind-the-scenes, system-wide infrastructure changes that will cause processing and the Management App API to be out of service in various periods between the 14th of July 20.00 CET and the 15th 04.00 CET. The Ingest API will still accept incoming requests, but no processing of source entities will happen in the maintenance period. The Management App will not be functioning during parts of the maintenance window. Reach out to support@enterspeed.com if you have any questions or concerns. 
Infrastructure maintenance https://www.enterspeedstatus.com/incident/101758 Fri, 15 Jul 2022 02:00:00 +0000 https://www.enterspeedstatus.com/incident/101758#39438aeefef6fc740efd62362a94c8494149ef4a34faa3d8cf142315e6d51ab8 Maintenance completed Infrastructure maintenance https://www.enterspeedstatus.com/incident/101758 Thu, 14 Jul 2022 18:00:00 -0000 https://www.enterspeedstatus.com/incident/101758#253aa1fa7678662f4d5cf7e434b51db996c754feadd2973549c385dc1ee83780 We are doing some behind-the-scenes, system-wide infrastructure changes that will cause processing and the Management App API to be out of service in various periods between the 14th of July 20.00 CET and the 15th 04.00 CET. The Ingest API will still accept incoming requests, but no processing of source entities will happen in the maintenance period. The Management App will not be functioning during parts of the maintenance window. Reach out to support@enterspeed.com if you have any questions or concerns. 
Processing of views degraded https://www.enterspeedstatus.com/incident/91824 Wed, 01 Jun 2022 04:00:00 -0000 https://www.enterspeedstatus.com/incident/91824#8ea5efc7fbccab16018c3bef02c56a9376d34da4524b1a334b1b5e34a5ff01a8 Our cloud provider is now 100 % functioning and processing is back to normal. Sorry for the inconvenience. Processing of views degraded https://www.enterspeedstatus.com/incident/91824 Tue, 31 May 2022 12:00:00 -0000 https://www.enterspeedstatus.com/incident/91824#7125679707401ffeb153039bb10e4244b708c29136757896236d0951c97471d3 Due to what appeared to be an outage in our cloud provider's West Europe region, processing of views in Enterspeed was affected. This resulted in Enterspeed being unable to process and serve the newest delivery content from 14:00 CET Tuesday the 31st of May until ~ 06:00 CET the first of June. Processing is delayed https://www.enterspeedstatus.com/incident/84594 Thu, 28 Apr 2022 13:55:00 -0000 https://www.enterspeedstatus.com/incident/84594#d5b9d060bf600f98c2f66db8d6b913ce6a9297d6ebbd5967c469b7f50ce462e1 We have now concluded the indexing process for all regions. We are sorry for the delays in processing. Please reach out if you have any questions and/or would like to read a full postmortem on the incident. Processing is delayed https://www.enterspeedstatus.com/incident/84594 Thu, 28 Apr 2022 12:28:00 -0000 https://www.enterspeedstatus.com/incident/84594#9ea7ceb5b4c414cf729c41c9bbc038ea2ce15ac5798e51cfcb7a6842cbaa385f We have identified some inconsistencies in data that was ingested between ~13:00 and ~13:45. We have initiated our indexing process for the affected tenants. ETA for our WEU delivery region is 15:00 and for our SEA region ETA is 16:00 (all times CET). 
Processing is delayed https://www.enterspeedstatus.com/incident/84594 Thu, 28 Apr 2022 11:47:00 -0000 https://www.enterspeedstatus.com/incident/84594#6352eb41466f30f79d3e9dbde052543deeac544718a056d721d884dca486f785 We have validated the fix and implemented it in all regions. We will keep this incident open while we validate data consistency and monitor the processing services. Next update at 14:45 CET. Processing is delayed https://www.enterspeedstatus.com/incident/84594 Thu, 28 Apr 2022 11:25:00 -0000 https://www.enterspeedstatus.com/incident/84594#464cca352f8d4c96de1777c143fbc85ac7580e8a07d25a3cb600c7ac3bbd354f We have identified the issue and a fix is being validated. We will post a new status update at 14:00 CET. Processing is delayed https://www.enterspeedstatus.com/incident/84594 Thu, 28 Apr 2022 10:39:00 -0000 https://www.enterspeedstatus.com/incident/84594#f9d1c019a2064eac7fbeef08e00d8e3e558be0a3b13d121214978ed81d92a21b Our team has been alerted that processing is delayed. As a result, new source entities are not being made available to the Delivery API. Existing data is not affected and the Delivery API continues to function. Management API is unavailable https://www.enterspeedstatus.com/incident/79837 Mon, 04 Apr 2022 10:19:00 -0000 https://www.enterspeedstatus.com/incident/79837#24303e727875da5b3c9def4c02e42cb9306fccdaa367f0ed1d08ad27d7167784 Our Management API is back after the successful maintenance operation. Management API is unavailable https://www.enterspeedstatus.com/incident/79837 Mon, 04 Apr 2022 10:12:00 -0000 https://www.enterspeedstatus.com/incident/79837#a56b875e87c9a89706d0486a6fd96dae56cf15d238f9dc443ad3a19fb5a82323 Our Management API is currently unavailable due to a planned maintenance operation. 
Enterspeed Management API fails in authentication https://www.enterspeedstatus.com/incident/76237 Wed, 16 Mar 2022 10:16:00 -0000 https://www.enterspeedstatus.com/incident/76237#ef0801307048442bd423902a87927d56e4e35caa60bd4b05dd30af1f90127dcd Our identity service provider has now resolved the issue and our Management API can authenticate users without issues again. Message from our identity service provider: "Between 09:13 and 10:09 UTC on 16 Mar 2022, a subset of end users of customers using Azure Active Directory B2C may have experienced failures and error notifications when attempting to authenticate. Retry attempts were likely to succeed during this incident. This issue is now mitigated." Enterspeed Management API fails in authentication https://www.enterspeedstatus.com/incident/76237 Wed, 16 Mar 2022 09:57:00 -0000 https://www.enterspeedstatus.com/incident/76237#11895c957050835d5035e82b1de146cda6f668d7ee93b6dc3bb322594a60ed3b Our identity service provider has acknowledged that they are currently experiencing issues: https://status.azure.com/en-gb/status Enterspeed Management API fails in authentication https://www.enterspeedstatus.com/incident/76237 Wed, 16 Mar 2022 09:46:00 -0000 https://www.enterspeedstatus.com/incident/76237#017713f6338ac5ab337ff56a3bbe2b995ac070b056d45f966cc9e0d9595ed93e Some users report intermittent issues with authentication when using our Management App. Our team is looking into the issue and monitoring the connection to our identity provider. The Enterspeed.com website was updated https://www.enterspeedstatus.com/incident/72486 Wed, 23 Feb 2022 20:38:00 -0000 https://www.enterspeedstatus.com/incident/72486#a38dd30d55ee23280a46cddee16da80c324f7507cbe60edccb8fc21d55fd5f59 The website is now back online with a valid certificate for all users. 
The Enterspeed.com website was updated https://www.enterspeedstatus.com/incident/72486 Wed, 23 Feb 2022 20:33:00 -0000 https://www.enterspeedstatus.com/incident/72486#3730dc7fb03c7f07e88fe90fc85d0ffcaa14ca2432848ecd4c98a4013ff45373 We have re-launched our website on a new hosting platform. All users were served an invalid SSL certificate until the new provider issued a new certificate.