Looking into Luigi: A Workflow Management System Review

Data intelligence relies on a strong, functional data pipeline. However, the workflows that feed those pipelines can be arbitrarily complex.

Building, connecting, and maintaining complex workflows add unnecessary work for data engineers.

It’s not an arrangement that works in the fast-paced world of enterprise software.

Fortunately, developers can lower their workload with tools like Luigi.

What Is Luigi?

Spotify created and maintains Luigi, a workflow engine whose philosophy and concepts were inspired by GNU Make.

It’s a Python module that provides a framework for building and running complex pipelines of batch jobs.

What problem does Luigi solve?

Luigi’s main function is to take care of workflow management so developers can focus on other concerns.

It helps developers build data pipelines by letting them declare dependencies between tasks and define the inputs and outputs of each task.

On top of creating data pipeline tasks, Luigi helps run them. It’s a good tool for handling dependencies, providing visualization tools, and handling and reporting failures.

When used with a central scheduler it can also enable distributed execution.

Benefits of Luigi

  • Smoothly resume data workflow after a failure.
  • Parametrize and re-run tasks on a schedule (daily, hourly, or as needed) with the help of an external trigger.
  • Organize code with shared patterns.
  • Command line integration.
  • Small overhead for a task (about 4 lines: class, def requires, def output, def run).
  • Everything is done by inheriting Python classes.
  • Can be extended with other tasks such as Spark jobs, Hive queries, and more.

Strengths of Luigi

Modular code makes software more reliable and easier to maintain and update.

With Luigi, writing modular code is simple. Developers can easily create complicated dependencies between tasks.

Better yet, managing those dependencies is equally straightforward.

Luigi’s simple API lets users build a highly complex tree of dependencies without making it too difficult to understand.

Other team members or outside maintainers can easily interpret the code.

Luigi is highly flexible. It relies on Python, which allows developers the freedom to create tasks that do anything needed.

Connecting components is easy and intuitive.

There’s no external or static configuration for the pipelines, only Python scripts, so everything is dynamic.

Last – but not least – is idempotency. Completed tasks are not run twice, so a failed workflow can be restarted from the middle.

It picks up right where it left off, which produces the same output every time.
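The restart behavior described above can be illustrated with a small, standard-library-only sketch. This is not Luigi's implementation or API: the `Task` base class, the `build` helper, and the `Extract` and `Total` tasks are hypothetical stand-ins that only borrow the method names mentioned earlier (requires, output, run). The idea it shows is the Make-style rule that a task whose output already exists is treated as complete and skipped.

```python
import os
import tempfile


class Task:
    """Toy stand-in for a workflow task; not Luigi's real base class."""

    def __init__(self, workdir):
        self.workdir = workdir

    def requires(self):
        return []  # upstream tasks; none by default

    def output(self):
        # One file per task class marks that task as done.
        return os.path.join(self.workdir, type(self).__name__ + ".txt")

    def complete(self):
        return os.path.exists(self.output())

    def run(self):
        raise NotImplementedError


class Extract(Task):
    def run(self):
        with open(self.output(), "w") as f:
            f.write("1\n2\n3\n")


class Total(Task):
    def requires(self):
        return [Extract(self.workdir)]

    def run(self):
        with open(self.requires()[0].output()) as f:
            numbers = [int(line) for line in f]
        with open(self.output(), "w") as f:
            f.write(str(sum(numbers)))


def build(task, ran):
    """Run dependencies first, skipping any task whose output exists."""
    if task.complete():
        return
    for dep in task.requires():
        build(dep, ran)
    task.run()
    ran.append(type(task).__name__)


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        first, second = [], []
        build(Total(workdir), first)        # fresh run: both tasks execute
        os.remove(Total(workdir).output())  # simulate a failed final step
        build(Total(workdir), second)       # restart: only the missing task runs
        return first, second
```

Calling `demo()` returns the task names executed on each pass. On the second pass `Extract` is skipped because its output file is still on disk, which is the sense in which a workflow restarts "from the middle".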

Weaknesses of Luigi

One of Luigi’s main weaknesses is the flip side of one of its biggest strengths.

Specifically, it can’t re-run partial or old pipelines since it picks up where it left off.

It also has no native support for distributed execution.

Developers need to pair it with a central scheduler to gain that functionality.

Some have found Luigi’s user interface to be hard to navigate.

This is one of the biggest reasons users move to Airflow, though with some practice the UI issue becomes less noticeable.

The biggest complaints of developers who’ve worked with Luigi revolve around issues with scaling.

There are two reasons for the tool’s scalability issues:

  • The number of Luigi worker processes is limited by the number of cron worker processes currently assigned to the job.
  • The web UI and scheduler run on a single-threaded process. If the scheduler is busy or someone else is using the UI, the web UI suffers from frustratingly slow performance.

Comparison

Airflow (Airbnb)

Airbnb uses a lot of data-heavy features: price optimization for hosts, property recommendations for guests, and internal tracing features to guide business decisions.

They created Airflow to meet their specific data needs, then decided to open source it in 2015.

It’s flexible and scalable, but users have experienced some problems with time zones, managing the scheduler, and unexpected backfills.

Pinball (Pinterest)

Pinterest created Pinball when they found none of the existing workflow management solutions met their requirements for customizability.

It has a lot of features and scales horizontally very well.

The community is small, though, and it doesn’t have good documentation.

Real-life Application

In practice Luigi is used for ETL (extract, transform, load) operations that feed data intelligence operations.

Luigi handles batch jobs, not streaming, continuous processes.

It’s not a data integration software, but it can be used to orchestrate custom data integration tasks.

Future outlook

Right now, Airflow is a more popular tool for workflow management.

Luigi still has its supporters and there are areas where it has the edge over Airflow and Pinball, but unless it can address its scalability issue it may not be able to maintain its user base going forward.

Every development project has unique needs. At Concepta, we build with tools chosen for each project to create a custom solution for every client. Claim your free consultation to see what we can do for your company!

Request a Consultation

Is JSON Schema the Tool of the Future?


JSON Schema is a declarative vocabulary for describing JSON data. It yields clear, easy-to-understand documentation, making validation and testing easier.

JSON Schema is used to describe the structure and validation constraints of JSON documents.

Some have called it “the future for well-developed systems that have nested structures”.

There’s some weight to those claims; it’s definitely become a go-to tool for those who get past its steep learning curve.

Reviewing the Basics

JSON, short for JavaScript Object Notation, is a lightweight data-interchange format.

It’s easy for humans to read and write, and equally easy for machines to parse and generate.

JSON Schema is a declarative language for validating the format and structure of a JSON Object.

It describes how data should look for a specific application and how it can be modified.

There are three main parts to JSON Schema:

JSON Schema Core

This is the specification where the terminology for a schema is defined.

Schema Validation

The JSON Schema validation is a document which explains how validation constraints may be defined. It lists and defines the set of keywords which can be used to specify validations for a JSON API.

Hyper-schema

This is where keywords associated with hyperlinks and hypermedia are defined.

What Problem Does JSON Schema Solve?

Schemas in general are used to validate files before use to prevent (or at least lower the risk of) software failing in unexpected ways.

If there’s an error in the data, validation against the schema fails immediately. Schemas can serve as an extra quality filter for client-supplied data.
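As a rough illustration of that quality filter, the sketch below checks a parsed JSON document against a schema. It is a hand-rolled validator that understands only three keywords (`type`, `required`, and `properties`), so it is far from a complete JSON Schema implementation; a real project would normally rely on a full validator library. The `USER_SCHEMA` record and the `validate` helper are hypothetical names chosen for the example.

```python
import json

# Hypothetical schema for a user record.
USER_SCHEMA = {
    "type": "object",
    "required": ["name", "age"],
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
}

# Maps JSON Schema type names to Python types.
TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "null": type(None),
}


def validate(instance, schema, path="$"):
    """Return a list of error strings; an empty list means valid."""
    errors = []
    expected = schema.get("type")
    if expected:
        ok = isinstance(instance, TYPES[expected])
        # bool is a subclass of int in Python, but not a JSON number.
        if isinstance(instance, bool) and expected in ("integer", "number"):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property '{key}'")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], subschema, f"{path}.{key}"))
    return errors


good = json.loads('{"name": "Ada", "age": 36}')
bad = json.loads('{"name": "Ada", "age": "thirty-six"}')
```

Here `validate(good, USER_SCHEMA)` yields an empty list, while `validate(bad, USER_SCHEMA)` reports the offending field by path, so bad client data can be rejected before any other code touches it.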

Using JSON Schema solves most of the communication problems between the front-end and the back-end, as well as between ETL (Extract, Transform and Load) and data consumption flows.

It creates a process for detailing the format of JSON messages in a language both humans and machines understand. This is especially useful in test automation.

Strengths of JSON Schema

The primary strength of JSON Schema is that it generates clear, human- and machine-readable documentation.

It’s easy to accurately describe the structure of data in a way that developers can use for automating validation.

This makes work easier for developers and testers, but the benefits go beyond productivity.

Clearer language allows developers to spot potential problems faster, and good documentation leads to more economical maintenance over time.

Weaknesses of JSON Schema

JSON Schema has a surprisingly sharp learning curve.

Some developers feel it’s hard to work with, dismissing it as “too verbose”. Because of the criticism, it isn’t well known.

Using JSON Schema makes projects grow quickly. For example, every nested level of JSON adds two levels of JSON Schema to the project.

This is a weakness common to schemas, though, and depending on the project it may be outweighed by the benefits. It’s also worth considering that JSON Schema has features which keep the size expansion down.

For example, objects can be described in the “definitions section” and simply referenced later.
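A minimal sketch of that reuse, written as a Python dict: the address shape is described once under `definitions` and pointed to with `$ref` from two properties. The `ORDER_SCHEMA` layout and the `resolve_ref` helper are assumptions made for illustration, and the helper handles only simple local references (it ignores JSON Pointer escaping and remote references). Note that the `definitions` keyword belongs to the older drafts; more recent drafts name it `$defs`.

```python
# Hypothetical order schema: the address shape is written once and reused.
ORDER_SCHEMA = {
    "definitions": {
        "address": {
            "type": "object",
            "required": ["street", "city"],
            "properties": {
                "street": {"type": "string"},
                "city": {"type": "string"},
            },
        }
    },
    "type": "object",
    "properties": {
        "billing": {"$ref": "#/definitions/address"},
        "shipping": {"$ref": "#/definitions/address"},
    },
}


def resolve_ref(root, ref):
    """Follow a local reference such as '#/definitions/address'."""
    if not ref.startswith("#/"):
        raise ValueError("only local references are handled in this sketch")
    node = root
    for part in ref[2:].split("/"):
        node = node[part]
    return node


billing = resolve_ref(ORDER_SCHEMA, ORDER_SCHEMA["properties"]["billing"]["$ref"])
shipping = resolve_ref(ORDER_SCHEMA, ORDER_SCHEMA["properties"]["shipping"]["$ref"])
```

Both references resolve to the same definition, so adding a field to the address shape means editing one place rather than every property that uses it.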

What Else Is There?

Some developers prefer to use Mongoose, an Object Document Mapper (ODM) that allows them to define schemas, then create models based on those schemas.

The obvious drawback is that an extra abstraction layer delivers a hit to performance.

Another option is Joi, a validation library used to create schemas for controlling JavaScript objects. The syntax is completely different, though, and Joi works best for small projects.

Sometimes developers jump into a new MongoDB project with a very flexible schema. This often leads to “schema hell”, where they lose control as the project grows.

When JSON Schema Is the Right Choice

Performance is undeniably important. However, there are times when the cost of recovering from mistakes is far higher than the cost of taking the speed hit that comes with schema validation.

For those times the performance drop isn’t large enough to justify the risk of bad data entering the system, and that’s where JSON Schema comes into play.

JSON Schema is proving itself as a development option, but there’s no single “best tool” for every project. Concepta takes pride in designing a business-oriented solution that focuses on delivering value for our clients. To see what that solution might look like for your company, reserve your free consultation today!


CrossBrowserTesting Review: The Good, The Bad, And The Alternatives


CrossBrowserTesting is a cloud-based testing platform for websites and apps.

It provides powerful tools for exploring how a website performs on a variety of devices and browsers, and it’s considered a staple for many developers.

Why is CrossBrowserTesting Important?

User experience has taken center stage in the battle for customer acquisition and retention. It’s more important than ever to make sure every user is having the same experience, no matter where they are or how they get to the website.

The quest for uniformity is made harder by the huge variety of browsers in use. Besides the “Big Five” (which, as of 2018, include Chrome, Firefox, Safari, Internet Explorer, and Edge) there are scores of alternative browsers that may be more common within a particular customer base.

To complicate matters further, new internet-ready devices are being invented every day. Developers have to plan for traditional computers and laptops as well as tablets, phones, and wearable devices. Even fitness equipment and some appliances can access the internet now.

There’s no practical way to manually test every browser and device that might be used by customers. However, it’s risky to ignore the less popular options. The average person uses 3 different connected devices over the course of a day.

If they happen to find the site or app on an unsupported platform, they’re unlikely to return through another device later.

CrossBrowserTesting provides a range of manual and simulated testing services on more than 1,500 devices and browsers. Developers use it to check whether their clients’ websites are rendering properly across the board.

CrossBrowserTesting provides a full, explorable picture of how well the product functions, ensuring better quality and a more consistent user experience.

Strengths of CrossBrowserTesting

What developers like about CrossBrowserTesting is that it’s easy to use. They can test their layouts on a huge combination of operating systems, browsers, and resolutions simultaneously. It shortens the amount of time they have to spend on testing without lowering quality.

CrossBrowserTesting uses real browsers instead of emulators. It’s also one of the very few tools offering manual testing on physical iOS and Android devices, with coverage for the Nexus, Galaxy, Tab2, and hundreds of other devices. Developers can swipe and interact with the devices during their tests.

Automated screenshot comparisons and testing videos give developers an overview of their site’s status. They can perform apples-to-apples comparisons and quickly identify platforms that need more attention.

Additionally, CrossBrowserTesting offers Selenium automation for mobile and desktop browsers.

Weaknesses of CrossBrowserTesting

Virtual machines for mobile devices can be painfully slow (though they’re still faster than individually testing each device). End-to-end tests are also not easy to edit.

Some developers have also had problems with CrossBrowserTesting’s documentation. They report typos, structure issues, and some misleading examples that caused unnecessary confusion.

What Else Is Out There?

While CrossBrowserTesting is a powerful tool, most developers use a variety of different tests to get fuller coverage. Here are some popular alternatives:

SmartBear TestComplete

SmartBear owns both CrossBrowserTesting and TestComplete. TestComplete offers more powerful testing options across technologies, and it’s easier to create tests. However, the code editor provides badly-formatted code and some users have struggled with frequent crashes.

BrowserStack

BrowserStack is a testing platform that boasts a large variety of developer tools and supports many different devices and browsers. Reliability has been an issue for some, though: the test client runs differently on different operating systems, and the virtual machines reject new tests instead of queuing them when overloaded.

Cypress.io

Cypress.io is an end-to-end testing tool built on Node. It’s easy to debug and has a highly supportive community. The problem is that Cypress doesn’t support more than one browser instance and doesn’t support native events (such as file uploads).

Future Outlook

In a world where digital transformation is key to staying relevant, developers are pushing to work faster without sacrificing quality. CrossBrowserTesting helps them stay on schedule while still creating a dynamic end product. With results like that, it’s not going anywhere.

At Concepta, we know that testing isn’t just something to cross off a checklist. It’s an integral part of development that directly impacts our clients’ bottom lines. That’s why we use test automation tools like CrossBrowserTesting to make sure users are getting the same great service wherever they are. Set up a free, no obligation consultation to discuss testing your company’s products to find out how they perform in the real world!


Firebase Review: Why Developers Are Fired Up About Firebase


Firebase has been growing fast since Google acquired it in 2014. Developers praise it as a way to keep up with the technical demands of modern enterprise.

The powerful mobile and web app development platform provides a healthy suite of tools for building and growing highly scalable apps, all within a shorter time frame that fits digital transformation efforts.

What exactly does Firebase bring to the table? Here are the five features most commonly cited by its community of supporters.

What is Firebase?

Firebase is a “Backend as a Service” (BaaS), meaning developers don’t need to set up or manage their own server infrastructure. This shortens development time and removes a layer of complexity for developers.

The best thing about BaaS, though, is that it frees developers from the tedium of building out a backend. Instead, they can direct all of their focus to creating dynamic, user-oriented apps.

Firebase has a Huge Feature Set

One of Firebase’s biggest draws is its robust, well-tested feature set. It has tools for nearly everything a developer could need. Some, like Google Analytics, are built in for free.

Others can be incorporated as needed, such as:

  • Authentication
  • Hosting
  • Push notifications
  • Real-time messaging
  • Cloud storage
  • Performance monitoring

These can all be used independently of each other. Developers have the option to buy only what they need instead of getting locked into a huge bundle they won’t use.

NoSQL

SQL databases are geared towards highly structured data. For less organized data like comments, photos, and reviews the flexibility of a NoSQL database is better.

Firebase handles large datasets and bi-directional references easily. That makes it a good choice for Big Data operations for which the object-oriented approach doesn’t work as well.
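One common way to model a two-way relationship in a JSON-style document store is to record the link on both sides, so that either direction can be read without a join. The sketch below shows that idea with a plain Python dict. It does not use any Firebase SDK, and the `users` and `groups` paths and the `join_group` helper are hypothetical names picked for illustration.

```python
# Hypothetical JSON tree, written as a Python dict.
tree = {
    "users": {"ada": {"name": "Ada", "groups": {}}},
    "groups": {"admins": {"title": "Admins", "members": {}}},
}


def join_group(tree, user_id, group_id):
    """Record the membership on both sides so either can be read directly."""
    tree["users"][user_id]["groups"][group_id] = True
    tree["groups"][group_id]["members"][user_id] = True


join_group(tree, "ada", "admins")
```

After the call, the user's groups and the group's members can each be looked up with a single key access. The trade-off is that every write has to keep both sides in sync.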

Firebase is Economical

Price is one of the most pressing priorities during development. Firebase lowers the initial investment by using a subscription service model.

The beginning tiers of Firebase are cheap or even free initially. Companies can publish their app and start working towards ROI much faster than when the backend has to be built from scratch.

There’s also the fact mentioned earlier, that developers can buy only what is needed at the time. That goes for both features and cloud storage.

By the time more storage is necessary, the app should be on its way to earning back its development costs.

Firebase is Enterprise-Oriented

All of these qualities contribute to Firebase’s most compelling benefit: its focus on enterprise. The platform is optimized for the kind of real-time and streaming apps that help a company stand out among its competitors.

It’s startup friendly, enabling growing companies to build cross-platform apps fast and economically. Apps start small, with just a little cloud storage, and scale as their usage grows.

Plus, Firebase offers easy social authentication integration. Allowing customers to log in with their social media solves several pressing business problems.

Visitors stay on the site longer, share posts more often, and are shown targeted ads that improve their user experience.

Firebase Limitations

Some developers are understandably wary of platform dependency. While it’s true that Firebase has more tools for migrating than it did at launch, it’s still reliant on Google.

The Parse shutdown is still fresh in the industry’s memory, so this is a risk some prefer to avoid altogether.

Configuring the database and security settings is on the complex side. It takes a Firebase expert to set them up without accidentally making sensitive sections public.

There are the querying limitations common to NoSQL to think about, too.

Long-term cost is another consideration. At lower levels it’s inexpensive, but at scale it can get pricey.

Final Thoughts

Firebase is a dynamic platform with a lot of potential. No tool is perfect for every situation, but where Firebase fits it serves as a welcome shortcut for companies trying to work through their development priorities.

Could Firebase be the platform you’re looking for? Schedule a free consultation with one of Concepta’s developers to discuss your new project and whether Firebase is the right fit.


NativeScript: Choosing a Mobile App Framework


NativeScript is a modern, open source mobile framework for building native apps.

A mobile framework is a customizable “skeleton” that contains a common selection of features necessary for building mobile apps.

What problem does NativeScript solve?

Developers can use JS, Angular, XML, and CSS-like languages to develop apps while employing native APIs for each platform instead of using web views to render each UI.

Benefits of NativeScript

  • Open source
  • Extensible
  • Platform-agnostic
  • Native API Reflection
  • Uses native UI components from the native OS (fully native apps, not packaged via browser)
  • Standards-compliant ECMAScript JavaScript code
  • CSS standard-compliant declaration
  • Hot reload functionality

Strengths of NativeScript

Although it’s a relatively new tool, NativeScript is backed by Telerik, a subsidiary of Progress. Progress has a reputation for backing dependable developer tools for enterprise.

It has an enthusiastic community who appreciate how easy it is for Angular fans to learn NativeScript.

Users can actually build native applications in Angular 2 as well as use CSS animations.

Cross platform functionality is essential, and NativeScript is well poised to ensure it.

The framework offers native performance on both iOS and Android. The whole stack is available for both platforms.

NativeScript espouses a “write once, adapt everywhere” philosophy: developers can adapt code between web and mobile. It also provides 0-day support for new OS releases.

Intellectual Property headaches can be bypassed with an open source program like NativeScript.

Its core is licensed under Apache 2.0 which allows users to use, modify, license, and distribute their software without having to classify their use as personal or commercial.

There are only very minimal requirements to leave a disclaimer and copyright notice in place.

NativeScript includes great tooling for productivity and developer consistency.

Implementation is noticeably faster compared to native mobile development.

Also, developers can integrate CSS animations into their NativeScript projects.

Weaknesses of NativeScript

Most of NativeScript’s weaknesses arise from being relatively new.

For example, there aren’t as yet many official plug-ins.

NativeScript has vocally loyal and active users, but its community is not as big as ReactNative’s.

Developers can’t use React with NativeScript yet. Only their “Core”, an MVVM style of building applications, is available.

However, Angular 2 also has first-class support and there are strong indications that Vue will be implemented as well.

Performance has historically not been as seamless on Android as on iOS.

A recent update ironed out many of the problems, but some issues remain.

Comparison

Xamarin

Xamarin is very similar to NativeScript in terms of underlying technology.

Though NativeScript doesn’t have as many available plug-ins as Xamarin, it has an edge in code recompilation. Changing code in Xamarin means recompiling and restarting the computer.

NativeScript is positioned for automatic instant recompiling, which is part of how it drives faster development.

ReactNative

Both ReactNative and NativeScript are cross-platform JavaScript frameworks for mobile development.

The key difference is in their ultimate goals: ReactNative is working towards being able to learn one tool and write for every platform while NativeScript wants to achieve shared code.

Ionic

Ionic tends to emphasize performance over cross-platform compatibility.

The concern with this is that ReactNative can match it in performance (at least for iOS, and nearly for Android) with a much gentler learning curve after updates.

Real-life application

NativeScript is used by well-known companies such as Verizon, Deloitte Digital, Bitpoints Wallet, and Daily Nanny.

One notably complex application is ShoutOutPlay, which lets users record personal messages and embed them within tracks in a Spotify playlist.

NativeScript was used when developing for different platforms as it let the developers write both iOS and Android UIs with one concise XML language.

Conclusion

NativeScript has two core strengths: speed of development and cross-platform optimization.

If those are the core priorities for your project, it will be a powerful tool to aid in mobile app development.

If you need highly experienced mobile app developers, share your challenges with us and we’ll help come up with the right solution tailored to fit your needs.
