2018 update on the OERu Technology Stack

As I prepare for the 2018 OERu Partners' Meeting, I'm working out how to convey the depth and breadth of the OERu open source technology stack - our infrastructure and applications - to our partners and other attendees.

Over the past year, we've had our first live OERu courses allowing learners to work towards a formal exit qualification (our "1st year of study"), and our infrastructure has worked as intended throughout, so I'm chalking that up as a win.

OERu Infrastructure - NGDLE

At the OERu, in following our open principles, we have ended up creating a "Next Generation Digital Learning Environment" (NGDLE) to meet the needs of our learners, partners, and OER collaborators. It's a distributed, loosely coupled component model. It's also entirely made up of Free and Open Source software, from top to bottom.

OERu Technology Service wheel - the basis for our Next Generation Digital Learning Environment
Services making up parts of the OERu's NGDLE. These are either hosted by the community or on the OERu's fully open source technology infrastructure.

We believe our approach has a lot of advantages. If emulated by our partners and other academic institutions, it could revolutionise the quality of digital services used in education, vastly reduce costs, and increase the autonomy and resilience of technical solutions, while providing unprecedented technology-related learning opportunities and agency for learners and educators alike. The purpose of this post is to describe our technology infrastructure, explain some of its advantages, and highlight the challenges it presents.

A key takeaway: if any of our partners adopted even one of the technologies we have incorporated into our NGDLE, they could easily save many times the value of their annual OERu subscription fees in the first year, and in every subsequent year. This blog exists to provide handy howtos to make it easy for our partners to trial and adopt these technologies in a maintainable way.

Geographic Diversity

In my role of Open Source Technologist, I maintain our global computing infrastructure, currently made up of four separate nodes, on behalf of the OERu:

  1. u.oerfoundation.org (5.9.142.102) - a dedicated server hosted in a Hetzner Online facility in Gunzenhausen, Bavaria, Germany.
  2. open.oerfoundation.org (40.127.81.149) - a compute instance hosted in a Microsoft Azure facility in Melbourne, Victoria, Australia (charitable organisation grant from TechSoup).
  3. containers.oerfoundation.org (202.49.241.195) - a compute instance hosted in a Catalyst Cloud facility in Wellington, New Zealand.
  4. we6.wikieducator.org (54.218.47.90) - a compute instance hosted in an Amazon Web Services facility in Portland, Oregon, USA.
A map view of the OERu's open source tech infrastructure.
This image is linked to a live OpenStreetMap view (open source alternative to Google Maps) - there you can zoom into each facility in which our hosting infrastructure is held. This interactive map was created using MapCustomiser.

Functional Diversity and Containers

All of these servers run the open source operating system, Linux, and with the exception of we6.wikieducator.org, run multiple services in Docker containers. We6 is the original OERu server and it runs the technology on which our first initiative is built: a MediaWiki implementation answering to https://wikieducator.org - it's where all our curriculum materials are assembled and maintained.

The story on our other hosts is more complicated. Here's a run down of the fully open source services each provides:

  1. u.oerfoundation.org is currently running a number of services, including 26 Docker containers, which collectively provide the following services:
    1. Our SilverStripe OERu website, https://oeru.org, which is our main contact with prospective partners and our course advertising platform.
    2. Our WordPress Multisite, https://course.oeru.org, our primary course delivery platform (each course instance is a separate "sub-site" but a user on our course site can enrol in any course sub-site), to which we can automatically deploy fully formed courses from WikiEducator via our Snapshotting system.
    3. Our Discourse Forums:
      1. https://community.oeru.org - for our educator and partner community
      2. https://forums.oeru.org - for our learners
    4. Our Rocket.Chat instance, https://chat.oeru.org, for real-time communication - text & media messages, audio and video calls - within our team, and with the broader partner and learner community. We also host a couple other Rocket.Chat instances for well aligned groups and partners trialling it. Or you could run your own Rocket.Chat...
    5. Our WEnotes Course Feed stack made up of several containers that monitor various social media and blogs, and then scan, store, and serve up our dynamic course feeds - for example.
    6. Our Matomo (formerly Piwik) instance, https://stats.oeru.org, allows us to track the use of our websites ourselves. It's functionally similar to Google Analytics, but without giving information on our web visitors to Google.
    7. Additionally this server hosts a number of native services (not in containers) including
      1. YourLS, our link shortener on https://oer.nz, and
      2. Semantic Scuttle, on https://bookmarks.oeru.org.
    8. Finally, we run a number of test systems including replicas of WikiEducator and our Course site.
  2. open.oerfoundation.org runs 27 Docker containers which collectively provide
    1. Our Mastodon instance, https://mastodon.oeru.org, our open education corner of the "Fediverse" - it's a micro-blogging platform similar to Twitter, but distributed - without the centralised corporate control and surveillance capitalism business model. Here's how to run your own Mastodon...
    2. Our Collabora Office + NextCloud instance, https://doc.oeru.org, which together are functionally similar to Google Drive or Dropbox + Google Docs, offering web based document storage, combined with the ability to collaboratively edit documents, spreadsheets, and presentations. Here's how to make your own...
    3. Our Etherpad-Lite instance, https://etherpad.oeru.org, for collaborative note taking.
    4. Our Mautic instance, https://mautic.oeru.org, which manages and automates our email out to learners and partners, as well as prospective partners and learners. It's an invaluable tool for magnifying the responsiveness and effectiveness of our small OER Foundation team. Replicate our approach to run your own.
    5. Our Lime Survey instance, https://survey.oeru.org, where we manage our learner and partner surveys.
    6. Our Moodle instance, https://moodle.oeru.org, for various assessment related testing and other activities.
    7. Our Wallabag instance, https://wallabag.oeru.org, a very slick shared bookmarking tool, which we're evaluating as a replacement for Semantic Scuttle.
    8. Our Drupal 8 CMS instance, that powers this very Tech Blog!
    9. Our own BitWarden instance, https://safe.oeru.org, our password keeper, which helps us ensure we use strong passwords and lets us optionally share them within specific organisation-level groups.
    10. We also run a couple of test websites built on a blogging/CMS platform called "Grav". Watch this space.
  3. containers.oerfoundation.org is our newest computing resource, based in New Zealand (made available through generous sponsorship from Catalyst Cloud). It is intended to run containers hosting services that are local to us. Its current set of Docker containers runs our GitLab instance, https://git.oeru.org, to which we are moving all of our remote software code repositories. There you can view our existing projects, which you're welcome to use, learn from, and improve. Software developers are encouraged to have a look around and make use of our code! Contributions are always welcome.

Agile and flexible

One of the best things about our "loosely coupled component" model is that we can quickly incorporate useful new capabilities, reinforce and update components, and readily replace weaker components with stronger ones as we identify them. Our open source policy allows rapid evolution without the sentimentality that the sunk-cost fallacy tends to create.

In recent months, we have tested a number of new open source technologies, and introduced a few of those into our component mix, including Lime Survey, Gitlab, and BitWarden, as well as enjoyed steadily improving capabilities - features, security, privacy, and performance - by tracking community-driven updates to most of our existing components.

On the back end, there have been even more improvements, like our move from normal Docker to Docker Compose, which has revolutionised our ability to rapidly deploy services and move them between hosts.

This is a side-by-side comparison of Semantic Scuttle (left) and Wallabag (right).

We have also been able to identify and test potentially better components for certain uses - for example, our "Bookmarking" tool, provided by an instance of Semantic Scuttle, is not as modern as many of our other components, and its interface is decidedly dowdy by comparison. We have since found another social bookmarking tool, Wallabag, which promises to provide a more usable service, with high quality open source mobile apps available, as well as in-built support for a variety of Single-Sign-On technologies, which Semantic Scuttle lacks.

Capacity and scalability

The OERu is starting small - with tens or hundreds of learners participating in our courses. This places minimal load on our infrastructure, but allows us to validate that it is working as intended. But for many software implementations, those sorts of numbers might already be challenging the ability of normally available computing infrastructure to supply a usable service.

One of the major advantages of our loosely coupled component model, making use of best-of-breed open source applications working together, is that each of those components is in active use in other contexts. They have all already had their "trial-by-fire", confronting the challenges of efficiently supporting large numbers of users (by large, I mean not tens or hundreds, but millions or tens of millions) and they have already evolved to meet those challenges.

Although the applications we choose as components are the products of different communities, different developers, and different technologies, all adhere to a set of well tested, robust, and scalable internet software service patterns. All have separate data stores (mostly databases, including MariaDB, PostgreSQL, MongoDB, CouchDB, and SQLite), themselves decoupled from the containers doing the computing. Those containers usually run scripting engines (we use PHP, Ruby on Rails, Python, and Node.js) that manipulate the data in a "stateless" way, pushing and pulling data to and from users' browsers, which employ open standards compliant HTML5 (comprising HTML markup, CSS for styling, and JavaScript for in-browser client application functionality).

That model makes it possible to scale up all of these services simply by adding more computing containers (which is facilitated by the use of Docker).
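To make the idea concrete, here is a toy sketch in Python. It is not our deployment code: the `SharedStore`, `Worker`, `LoadBalancer`, and `serve` names are invented for illustration, and an in-memory dictionary stands in for a real database such as PostgreSQL. It only aims to show why stateless workers in front of a separate data store can be multiplied without changing what users see.

```python
from itertools import cycle


class SharedStore:
    """Stands in for a database living outside the compute containers."""

    def __init__(self):
        self._visits = {}

    def increment(self, user):
        # All user state is kept here, never in a worker.
        self._visits[user] = self._visits.get(user, 0) + 1
        return self._visits[user]


class Worker:
    """A stateless compute container: it remembers nothing about users."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, user):
        count = self.store.increment(user)
        return f"{self.name}: hello {user}, visit {count}"


class LoadBalancer:
    """Round-robin dispatch across however many workers exist."""

    def __init__(self, workers):
        self._workers = cycle(workers)

    def handle(self, user):
        return next(self._workers).handle(user)


def serve(requests, worker_count):
    """Serve a list of requests with the given number of workers."""
    store = SharedStore()
    workers = [Worker(f"worker-{i}", store) for i in range(worker_count)]
    balancer = LoadBalancer(workers)
    return [balancer.handle(user) for user in requests]


if __name__ == "__main__":
    for line in serve(["ana", "ben", "ana", "ana"], worker_count=3):
        print(line)
```

Whether `worker_count` is 1 or 3, the visit counts come out the same, because any worker can answer any request. Only the worker names in the responses differ.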

Costs

One way that the OERu maintains its capabilities with such a small infrastructure and development budget is that we adhere to a few key principles:

  1. Use commodity open source hosting (we only run Linux), allowing for rapid movement between hosting providers with minimal trouble or disruption to our services.
  2. If using a Software-as-a-Service (SaaS) solution, strongly prefer open source options, which give us a safety valve if the pricing model/service doesn't suit our needs. This largely removes vendor lock-in. (At this point, Zoom is the only proprietary software in our stack.)
  3. Ensure any external purchased service is fixed price and does not increase with number of users.
  4. Include internal maintenance time in cost of ownership.

As described above, we use four main hosting providers across the globe. All of our computing instances and dedicated machines are commodity systems without proprietary features. We never exceed the (usually very generous) in-built data and storage allotments, so our prices are fixed and predictable.

The OERu's entire annual infrastructure/IT costs can be summarised as follows (values approximate, in USD):

  • Four Servers: AWS ($4000) + Hetzner ($440) + Azure ($0) + Catalyst Cloud ($0) = $4440.
  • SaaS: Zoom ($180) + Kanboard ($360) = $540.

Total annual software + infrastructure budget: $4980

Some of the OER Foundation's hosting infrastructure costs are covered by a "Charitable Organisation Grant" which includes $5000/year worth of Microsoft Azure hosting services. We also gratefully receive $500/month sponsored hosting services from the NZ-based hosting provider, Catalyst Cloud, who provide a fully open source cloud hosting infrastructure, and like the work the OERu is doing!
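For anyone who wants to tally these figures themselves, a few lines of Python will do it. This is only a sketch using the approximate amounts quoted in this post. The variable and function names are illustrative, and the donated amounts are the ceilings of the grant and sponsorship, not necessarily what we actually consume.

```python
# Approximate annual cash costs in USD, as listed in the summary above.
cash_costs = {
    "AWS": 4000,
    "Hetzner": 440,
    "Azure": 0,           # covered by the charitable organisation grant
    "Catalyst Cloud": 0,  # covered by sponsorship
    "Zoom": 180,
    "Kanboard": 360,
}

# Upper bound on the annual value of donated hosting, in USD.
donated_hosting = {
    "Azure grant": 5000,                     # $5000/year
    "Catalyst Cloud sponsorship": 500 * 12,  # $500/month
}


def totals(cash, donated):
    """Return (annual cash spend, annual ceiling of donated hosting)."""
    return sum(cash.values()), sum(donated.values())


if __name__ == "__main__":
    cash_total, donated_total = totals(cash_costs, donated_hosting)
    print(f"Cash spend per year: ${cash_total:,}")
    print(f"Donated hosting per year (ceiling): ${donated_total:,}")
```

The line items come to a little under $5000 in cash per year, alongside donated hosting worth up to roughly twice that.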

Commodity

We run Linux (Ubuntu or Debian), which is available at no cost, on all of our servers. We can run as many or as few systems as we want at a fixed cost. Only the cost of my time and the relative computing resource requirements are variable - but these vary far less than linearly with user numbers, e.g. for 10 times more users, we might require only 10% more of my time, and maybe 50% more computing resources.
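To see what that kind of sub-linear growth does to the cost per user, here is a rough Python sketch. The growth rates are the illustrative ones from the example above. The baseline costs (1.0 unit each for my time and for computing) are hypothetical placeholders, and the function names are made up for this illustration.

```python
def scaled_cost(base_time, base_compute, time_growth=0.10, compute_growth=0.50):
    """Total cost after user numbers grow, given the fractional growth in each input."""
    return base_time * (1 + time_growth) + base_compute * (1 + compute_growth)


def per_user_cost_ratio(base_time, base_compute, user_multiplier=10):
    """Per-user cost after growth, as a fraction of the per-user cost before."""
    before = base_time + base_compute
    after = scaled_cost(base_time, base_compute)
    return (after / user_multiplier) / before


if __name__ == "__main__":
    # Hypothetical baseline: time and computing cost one unit each.
    ratio = per_user_cost_ratio(1.0, 1.0)
    print(f"Per-user cost after 10x growth: {ratio:.0%} of the original")
```

The exact ratio depends on how the baseline splits between time and computing, but with these growth rates each additional user costs far less than the ones before.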

Case Study: SaaS and the value of open source

At last year's Partners' meeting, we described Mautic, an impressive new open source "marketing automation" tool we were using to automate much of our email communication with learners and partners as well as prospective learners and partners.

Initially, to test it, we opted to use the $30/month SaaS "entry-level" offering from Mautic.com which allowed us a single login to get started right away and test the Mautic system's fit to our requirements. It allowed us up to 2000 contacts, with an increased cost beyond that.

Shortly after the last partner meeting, however, the folks at Mautic.com contacted us to say that their pricing model was changing and that our costs would go up by more than 10 times, to $500/month, with a much steeper increase for new contacts. For example, 10k contacts would cost us more like $1000/month.

Mautic SaaS vs. Self-Hosted
Relative cost models of the hosted (SaaS) Mautic service vs. the self-hosted OERu Mautic instance.

Given the substantial price rise and unpleasant "scaling factor" for new contacts, combined with the fact that we found Mautic a very useful piece of software, we were in an uncomfortable position. Thankfully the Mautic platform itself is open source software. So, unlike with proprietary SaaS offerings, we had the option of hosting the service ourselves.

In the course of a day or two of my time invested into creating an implementation on our hosting infrastructure (which I have documented), we had https://mautic.oeru.org up and running.

Our new cost profile for Mautic is far more favourable. I spend perhaps an hour or so every month or two to update the Mautic system (it's improving continually thanks to the efforts of its Mautic.com-led developer community - which now includes the OERu!). Mautic places a negligible additional load on our AU-based infrastructure, but I have assigned it a small proportion of the (donated) hosting cost, a bit of my time, and a negligible amount for outgoing email costs (the 20k or so emails our Mautic has sent so far using the AWS "Simple Email Service" commodity SaaS have cost us a whopping $0.50 so far).

Overall, the per-month cost comparison for 10000 contacts looks like it's about $1000 vs. $80, or an annual savings of approximately $11,040 off a total of $12,000. That's a massive 92% savings, amounting to twice our total infrastructure budget.

And the savings only increase as we grow, as we fully intend to do! If, for example, we ended up with 100k contacts in the next year, the SaaS cost would be more like $4000/month. For 100k contacts, our self-hosted system's costs might increase to $120/month (due to increased infrastructure resources being used)... The resulting savings would then be $46,560 off a total of $48,000 - a non-trivial savings of 97% or almost 10 times our total infrastructure budget!
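The arithmetic behind these two scenarios is easy to check in Python. The monthly figures are the estimates quoted above. The function name and the `scenarios` dictionary are just for illustration.

```python
def annual_savings(saas_monthly, self_hosted_monthly):
    """Return (annual savings, annual SaaS cost, savings as a fraction of SaaS cost)."""
    saas_annual = saas_monthly * 12
    saved = (saas_monthly - self_hosted_monthly) * 12
    return saved, saas_annual, saved / saas_annual


# Monthly USD estimates from this post: (SaaS, self-hosted).
scenarios = {
    "10k contacts": (1000, 80),
    "100k contacts": (4000, 120),
}

if __name__ == "__main__":
    for label, (saas, self_hosted) in scenarios.items():
        saved, total, share = annual_savings(saas, self_hosted)
        print(f"{label}: save ${saved:,} of ${total:,} ({share:.0%})")
```

Swapping in your own institution's SaaS quote and an estimate of self-hosting costs gives a quick first look at whether a pilot is worth the effort.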

More importantly, the absolute cost of our environment is a tiny fraction, perhaps 5-10%, of that represented by a managed, outsourced SaaS offering. The open source self-hosted approach we've been able to validate represents a huge opportunity, particularly for higher education in emerging economies.

Progress since last year

In addition to building and maintaining the above open source software stack, our small team occasionally identifies opportunities to act decisively and strategically by writing our own code to bridge gaps or improve the usability of components in our NGDLE. Since the last OERu partner meeting, we've rolled up our sleeves to create the Blog Feed Finder, and we're about to release our new Registration and Enrolment plugin for WordPress MultiSite implementations (still a work-in-progress).

We have lots of other strategic initiatives stacked up and waiting to receive our full attention. The main constraint on that progress is our capacity, namely me (as the OER Foundation's Open Source Technologist). Mine is a very gratifying role, but a bit daunting at times - so much to learn and do!

What this means for OERu Partners

So, what should our OERu partners take from this description of the OERu's NGDLE? I think the most important thing is this: the status quo for higher learning institution IT conventions is neither the only way to do things nor the best way.

Because we are not bound by convention or historical decisions, we at the OERu are able to pioneer new approaches, from a "technology expert" perspective. We are driven by open principles and very tight resource constraints, but we also need to fulfil our vision: to build a rich, fit-for-purpose infrastructure for learners and OER collaborators alike, which has the potential to scale to unprecedented global learner volumes.

Implementing an open source end-to-end service gives the OERu a unique perspective and experience compared to organisations who only implement the occasional open source component in the midst of IT infrastructure dominated by proprietary software that is costly and extremely restrictive by comparison.

We are also building (anonymous) monitoring systems into these services to ensure we can measure our success (without impinging on the privacy of our learners or collaborators). The insight we gain may well be of value to our partners.

Return on (open source technology) Investment

Return on Investment (ROI) can be achieved in a number of ways: investment can increase productivity and therefore value created (usually measured as profit) or it can be achieved by reducing costs. The best ROI achieves a combination of both.

Our OERu partners can safely appraise our NGDLE and pick and choose from among the many (always improving) best-of-breed open source components that might fill an emerging need within their own institutional context. We can provide assistance and demonstration instances to help them initiate pilot implementations and work with local advocates and experts to provide ongoing support.

Any partner interested in the capabilities of technologies we have included in our NGDLE has the potential to save their institutions many times the cost of their OERu partner membership if they adopt any one of them - along the lines of the Mautic case study above... And, of course, that savings increases with the number of technologies adopted. 

 

Blog comments

This is a very impressive accomplishment Dave. The only thing that wrinkles my brow is that a significant chunk of OERU spending is going to the very corporate datafarms we're trying to replace (Microsoft and Amazon), by developing software and services that respect their users' software freedom. If we are swapping from gratis services managed by the datafarms, to taking on management work ourselves while starting to pay them for hosting, is this actually progress? They still make all the money, but now we pay them for doing some of their work for them ;-P

I'm much happier to see OERU money going to Catalyst. As you know, having worked for them, they are not and have never been datafarmers. They have been involved in free code development and deployment in both the commercial and governmental sectors in Aotearoa, and huge supporters of the libre commons communities in the country (including sponsoring and organizing the NZ Open Source Awards). It would be great to read your next annual report, and see that all the OERU infrastructure hosted on the datafarms has been moved to companies like Catalyst. It's the final piece of the puzzle :)

In reply to Danyl Strype

Fair comment, Danyl. Yes, we agonised over the use of Microsoft's Azure cloud quite a bit... we decided in the end that, given our use of their service is within their USD5000 annual charitable organisation grant (so we're taxing their services rather than funding them) and our policy of actively avoiding the use of any service-specific proprietary tools (i.e. we treat all our hosting services as commodity services), we're effectively exploiting their efforts to lock organisations into their offering (and convert them to paying customers) and using them to promote their antithesis, namely fully open, community focused services...

The use of Amazon is somewhat less contentious (given that Amazon hasn't historically been actively hostile to FOSS like Microsoft has been for most of its history), and although we're currently paying for their service (albeit not very much) we're similarly treating their services as commodity, so that the instant we're able to identify an alternative service that's fit for our purpose but is better aligned with our principle, we can move there toot-sweet! :)

Keep on keeping us honest!

In reply to dave

Thanks for acknowledging the issues Dave, and great to be able to see the details of your reasoning. Can I suggest you include these details in future reports where these hosting choices are mentioned? Pleased to hear the OERu is not actually giving Microsoft any money :)

> Amazon hasn't historically been actively hostile to FOSS like Microsoft has been for most of its history

True, but many of their business activities are bad for software freedom, and human freedom in general. The one that always sticks in my head is the centralized censorship power that allowed them to delete copies of George Orwell's '1984' from thousands of Amazon Swindle devices. A long list of their other bad behaviours can be found here, with links to sources:
https://stallman.org/amazon.html

> the instant we're able to identify an alternative service that's fit for our purpose but is better aligned with our principle, we can move there toot-sweet! :)

Good to hear. Could you perhaps write more about exactly what these hosts provide that make them difficult to find a replacement for? Such a description could make an excellent topic for a future blog post. It may be that someone who reads it is inspired to set up a suitable service, or can point you to one that already exists.

Keep up the great work, both on the open tech stack itself and on the open documentation practice, which is equally important IMHO.
