Questions and answers

We are interested in becoming a client. What would you need from us to make the collaboration successful?

We need four roles to be filled:

A champion. Someone who succeeds or fails with us, and can pull strings within the company to make things work out, e.g. to get access to data.
A product manager. Someone who owns the priorities in the shared backlog, and helps us prioritise and refine the backlog items.
A subject matter expert. Someone to answer detailed questions about the data we are processing, and how it fits into your products.
A technical contact. Someone to help us sort out technical integration details, e.g. for ingesting data sources.

One person can fill several of these roles.

Do you sell our data or use it for your own purposes?

No, you own your data, and we only process it on your behalf, and for your benefit. In this respect, our relationship is similar to that between a cloud provider and its customers.

Where is my data stored?

With a major cloud provider, within the EU.

Who pays the cloud bill?

Scling pays cloud providers and other suppliers that we use. It is included in the pricing for operations. The pricing includes the cost of storing a single copy of ingested data for 10 years.

What is your SLA or availability?

Most pipelines do not require a strict SLA, and customers should not pay for it. So by default, the SLA is “best effort” with email support during working hours. For pipelines with higher requirements, raising the SLA level one step is a deliverable, with a small increase in operational cost. An SLA level beyond “best effort” requires a customer engagement of at least 6 deliverables per month.

I’d like a complex feature. Is that one deliverable?

No, we work together to break down complex features into small deliverables - as small as they can get while still providing some value to you as a customer. The agile workshop “Elephant Carpaccio” is a good exercise for learning to break down complex features into small deliverables. For example, if you want a recommendation API, it might be split into a handful of deliverables, e.g.:

Minimum viable product: Emit daily file with recommendations of the most popular items based on the sales source. This provides some minimal business value, since it can be compared with real sales for evaluation.
Combine sales with a demographic data source, recommend popular items based on country.
Serve recommendations in an unauthenticated API.
Add basic API authentication.
Recommend popular items based on age and country.
Add user history, avoid recommending previously bought items.
Make recommendations personalised, with basic collaborative filtering.
…

As you can see, each deliverable is small, and that is important. In order to build valuable data products, each step should be evaluated in order to determine the next step. In many cases, well-tuned simple solutions work as well as complex algorithms. We should only use new shiny things that are expensive to build and operate where they really matter, and always benchmark them against simple alternatives, or combine both.
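To illustrate how small the first deliverables can be, here is a minimal sketch of three of the steps listed above: overall popularity, popularity per country, and excluding previously bought items. The record fields (`user`, `item`, `country`), the function names, and the sample data are assumptions made for this example, not an actual schema or implementation.

```python
from collections import Counter, defaultdict

# Hypothetical sales records; the field names are assumptions for this sketch.
SALES = [
    {"user": "u1", "item": "socks", "country": "SE"},
    {"user": "u2", "item": "socks", "country": "SE"},
    {"user": "u3", "item": "socks", "country": "DE"},
    {"user": "u1", "item": "hat", "country": "SE"},
    {"user": "u3", "item": "hat", "country": "DE"},
    {"user": "u3", "item": "scarf", "country": "DE"},
    {"user": "u2", "item": "mittens", "country": "SE"},
]


def rank(counts):
    """Order items by descending sales count, breaking ties alphabetically."""
    return sorted(counts, key=lambda item: (-counts[item], item))


def most_popular(sales, top_n=3):
    """Step 1: the best-selling items overall."""
    return rank(Counter(s["item"] for s in sales))[:top_n]


def most_popular_by_country(sales, top_n=3):
    """Step 2: the best-selling items within each country."""
    per_country = defaultdict(Counter)
    for s in sales:
        per_country[s["country"]][s["item"]] += 1
    return {c: rank(counts)[:top_n] for c, counts in per_country.items()}


def recommend_for_user(user_id, country, sales, top_n=3):
    """Later step: popular items in the user's country, minus items already bought."""
    bought = {s["item"] for s in sales if s["user"] == user_id}
    local = Counter(s["item"] for s in sales if s["country"] == country)
    return [item for item in rank(local) if item not in bought][:top_n]


print(most_popular(SALES, top_n=2))
print(most_popular_by_country(SALES, top_n=1))
print(recommend_for_user("u2", "SE", SALES))
```

Each function is a small step beyond the previous one, so its output can be compared with real sales before deciding whether the next step is worth building.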

Some deliverables are so easy for you - why do you charge the same amount for all?

Our value proposition is that we are proficient with data engineering, have built data platforms many times, for many years, have the appropriate tooling, and can use our knowledge and machinery to be more efficient and take your data features to production quicker. Some deliverables will seem easy for us, since we apply our tools and patterns that are well known to us, but might take others more time to figure out. In such cases, Scling profits. Other deliverables will require more work, but would have been riskier and taken a long time for companies with less experience and without adequate tooling. Building recommendation systems or respecting the right to be forgotten in a data lake are such examples. In those cases, you profit. Over time, we share the profit from our partnership.

How can you process our data without our expertise?

Domain expertise is crucial for success. For some customers, such as media or retail, the domain is comprehensible by laymen. In those cases, knowledge transfer through meetings and documents is sufficient. In other cases, e.g. manufacturing, learning the domain takes time, and customers may have valuable algorithms to contribute. In such cases, subject-matter experts from the customers embed with us, and we develop the solutions together. It requires the customer to spend work time, but that time is also an intensive course in practical data engineering for customer staff, so the benefit is mutual.

Will code or data be shared with your other customers?

Your data is not shared. We share reusable code among our customers. That is one of the benefits for our customers - shared development and maintenance costs. The shared code is typically technical or generic, and not specific to your applications. For common domains, such as web and retail, we share reusable domain-specific code and definitions between customers. We do not share corporate secrets, and if you want a particular innovation not to be shared, we can comply.

I’d like to sell my data, can you help me?

Yes, we can handle the technical arrangements. If your data is covered by the GDPR, you cannot sell it, only lease it out. In that case, we can arrange for user requests for deletion or withdrawn consent to be passed on to the lessee.

Is my data secure?

We have more than a decade of experience with secure cloud environments, and we apply cloud security best practices, e.g. hardware-based multifactor authentication for personal credentials and asset management with infrastructure as code. We use standard practices for developing applications based on open source software, i.e. we take security precautions that do not significantly hinder the development process or add excessive complexity. All security has a cost, and for some types of security hardening, there is a tradeoff. The right level depends on the sensitivity of data, and should be chosen by each customer. For example, we do not want customers that ingest publicly available data to pay the cost of strict manual security procedures. For other customers, manual change validation, strict open source dependency lockdown, additional protection layers, and external penetration testing might be justified.

We are happy to be transparent with our processes, as well as apply stricter security procedures when needed. Security hardening would be one form of development deliverable, and we can provide a suitable backlog of hardening deliverables based on threat modelling.

Is the data processing compliant with GDPR?

We handle ingested data in compliance with GDPR regulations, including minimising access, applying anonymisation and pseudonymisation where possible, limiting data retention, respecting consent, providing user data extracts, and respecting the right to be forgotten. Adding technical compliance solutions is one form of development deliverable.
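As an illustration of what pseudonymisation can look like in code, the sketch below replaces user identifiers with keyed hashes and forgets a user by discarding that user's key. This is one generic pattern, shown under simplifying assumptions (an in-memory key store and a hypothetical `Pseudonymiser` class); it is not a description of a specific production solution, and whether a given technique satisfies a legal requirement is outside the scope of the example.

```python
import hashlib
import hmac
import secrets


class Pseudonymiser:
    """Maps user ids to stable pseudonyms using a secret key per user.

    Hypothetical helper for illustration; a real system would keep the
    keys in a protected, durable store rather than in memory.
    """

    def __init__(self):
        self._keys = {}  # user id -> secret key

    def pseudonym(self, user_id):
        """Return the same pseudonym for a user for as long as the key exists."""
        key = self._keys.setdefault(user_id, secrets.token_bytes(32))
        return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()

    def forget(self, user_id):
        """Discard the user's key, so old pseudonyms cannot be recomputed from the id."""
        self._keys.pop(user_id, None)


p = Pseudonymiser()
before = p.pseudonym("alice@example.com")
assert before == p.pseudonym("alice@example.com")  # stable while the key exists
p.forget("alice@example.com")
after = p.pseudonym("alice@example.com")  # a new key gives an unrelated pseudonym
print(before != after)
```

Records that already carry the old pseudonym still exist after `forget`; removing or rewriting them in storage is a separate step that this sketch does not cover.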

As a customer, you have the relation to end users, and are therefore the data controller, and must implement additional procedures in order to be compliant, e.g. receive deletion requests, and pass them to us. In GDPR terminology, we are a data processor. We have implemented technical GDPR compliance solutions for multiple companies, and are happy to work together to ensure that your data handling overall is GDPR-compliant. We do not provide legal advice, however.

Can you run your data platform in my data center? Or in my home country?

We can run in an environment that provides a Kubernetes cluster, scalable storage, a relational database, and sufficiently secure access control. The pricing will be different from the fully hosted solution, however, and will depend on whether you supply infrastructure, and what procedures are required. If you want us to run in a particular location where there are no suitable cloud providers, and want us to take care of the infrastructure for you, we will team up with suitable partners to operate the underlying infrastructure that we need.

How do we leave Scling’s service? Can we take over data pipeline operations?

If you decide to leave, you can take over the operations of developed pipelines. You get a copy of the data processing code, as well as any libraries and operational configuration necessary to run the pipelines. The platform is built on open source technology and cloud services available on any of the major clouds, in order not to lock our customers to proprietary technology. In order to put our money where our mouth is, we can offer to port data flows at a predetermined fixed price per flow once you have established an adequate destination environment.

If you leave, you do not get access to our internal operational tools or monitoring tools that are not required to execute the pipelines. We have operation automation tools that generate code and configuration, e.g. for Kubernetes. When you leave, you will get the generated files, which you then maintain and change when you want to change your pipelines.
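To make the idea of generated configuration concrete, here is a minimal sketch of a generator that turns a short pipeline description into a Kubernetes CronJob manifest. The function name, pipeline name, image, and schedule are hypothetical and chosen for illustration; the sketch does not show the actual automation tools. The output is JSON, which Kubernetes accepts alongside YAML.

```python
import json


def cronjob_manifest(name, image, schedule, args):
    """Build a Kubernetes batch/v1 CronJob manifest as a plain dictionary."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,  # standard cron syntax
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [
                                {"name": name, "image": image, "args": list(args)}
                            ],
                        }
                    }
                }
            },
        },
    }


# Hypothetical pipeline description, for illustration only.
manifest = cronjob_manifest(
    name="daily-recommendations",
    image="registry.example.com/pipelines/recommendations:1.0",
    schedule="0 3 * * *",
    args=["--date", "yesterday"],
)
print(json.dumps(manifest, indent=2))
```

In this picture, the generated file is what you would receive and maintain by hand after leaving, while the generator itself stays with the party that built it.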

If we leave Scling, will we be required to use your providers?

We avoid using commercial providers that would make it difficult for our customers to leave our service. We do use a major cloud provider, but restrict ourselves to using cloud services that are available with all three major cloud providers. All services that we rely on also have open source equivalents for customers that wish to move to an on-premise installation. We use no other commercial services for production-critical components.

What do I need to provide in order to exit?

You need to use a major cloud service, or install the following components in an on-premise data center:

Kubernetes
Scalable file storage, such as a cloud object store or a Hadoop cluster
A relational database
A Kafka cluster, if you have requested stream processing data flows
Authentication and authorisation infrastructure

For development and operations, you will also need:

A git repository server
A continuous integration build server
A system for collecting logs
A system for collecting and monitoring metrics

There are no specific requirements on these systems, so you can connect your data platform to the corresponding systems that you are already operating. If you are not operating such systems, any IT consultancy firm can do it for you.
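The two lists above can be read as a checklist. As a small sketch, the function below reports which components a destination environment still lacks. The component labels and the function name are shorthand invented for this example, and Kafka is only required when stream processing flows are in use.

```python
# Shorthand labels for the components listed above (assumed names, for illustration).
RUNTIME = frozenset({
    "kubernetes",
    "scalable file storage",
    "relational database",
    "authentication and authorisation",
})
STREAMING = frozenset({"kafka"})
DEVELOPMENT = frozenset({
    "git server",
    "ci build server",
    "log collection",
    "metrics monitoring",
})


def missing_components(available, uses_streaming=False):
    """Return the required components that are not in `available`, sorted by name."""
    required = set(RUNTIME | DEVELOPMENT)
    if uses_streaming:
        required |= STREAMING
    return sorted(required - set(available))


# Example: an environment that has everything except log collection.
environment = (RUNTIME | DEVELOPMENT) - {"log collection"}
print(missing_components(environment))
print(missing_components(environment, uses_streaming=True))
```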

We are a data component or platform vendor. We think you should use our product.

For the reasons explained above, we will not take dependencies on commercial vendors, except for the major cloud providers.

Scling.
