While there are many challenges young companies might struggle with, they have certainly escaped one that is a blessing and a curse at the same time – legacy IT systems. Data is indeed one of a company’s most valued assets, as knowledge (read: ‘data’) empowers better, more informed business decisions. Moreover, chances are your organization already has most of the knowledge it needs. The only caveat is that it might be inaccessible and, therefore, pretty useless.
In fact, according to various estimates, up to 97% of collected data is stored away and never used again, delivering zero value to the organization. Furthermore, a large portion of it comprises the so-called ‘dark data’ – data that is redundant, faulty, or too old to hold any value, or data that has simply been forgotten.
And – here’s a surprising twist – this has an unpleasant effect on our climate. According to a recently published study by the American company Veritas Technologies, if we continue collecting and archiving all this ‘data waste’, we’ll end up with 5.8 million tons of CO2 being pumped into the atmosphere this year alone.
So, the main question is: how can we ‘reactivate’ legacy systems and, in doing so, not only put all the available data to use but also do something good for our climate?
Most organizations are fully aware that an ‘oldie’ is not always a ‘goldie’. In today’s world, companies increasingly rely on digital technologies to develop new products, increase operational efficiency, reduce costs, and – most importantly – ensure that customers have a satisfying experience.
So how is it that many companies still end up with sometimes decades-old systems that either run in parallel or are simply redundant, driving up maintenance costs and potentially posing security risks and compliance issues?
Sometimes, a not entirely thought-through data migration strategy is the culprit. A company decides to replace one mission-critical piece of technology with a better one. To avoid disrupting business operations, it lets both systems run in parallel “for a limited period” until the data migration is complete – only to find out that not all data can be migrated as-is due to, for example, outdated and incompatible formats. Or the business logic of the legacy system wasn’t understood as well as had been assumed, and the cumbersome source code is the only documentation the IT team has.
Another reason could be that the legacy system is – in the truest sense of the word – a legacy from a merger or acquisition. It’s a common issue: an acquired company brings outdated, inefficient, or redundant systems into “the new relationship”. Typically, both sides are aware of this and try to address the issue during the pre-acquisition due diligence process. However, this often results in devising patches and workarounds that merely paper over the limitations of such systems. Consequently, instead of solving the issue efficiently and for the long term, companies end up with incompatibilities among the individual layers of the technology stack.
In the end, whether due to mergers and acquisitions, unforeseen issues during software replacement, or simply as a consequence of organic growth, the result is the same. Potentially valuable data – hence, knowledge – is locked away and inaccessible for use.
#integration #api integration #legacy #integration platform #legacy data #api
Install via pip:
$ pip install pytumblr
Install from source:
$ git clone https://github.com/tumblr/pytumblr.git
$ cd pytumblr
$ python setup.py install
pytumblr.TumblrRestClient is the object through which you'll make all of your calls to the Tumblr API. Creating one is easy:
import pytumblr

client = pytumblr.TumblrRestClient(
    '<consumer_key>',
    '<consumer_secret>',
    '<oauth_token>',
    '<oauth_secret>',
)

client.info() # Grabs the current user information
Two easy ways to get your credentials are:
1. The built-in interactive_console.py tool (if you already have a consumer key & secret)
2. The Tumblr API console at https://api.tumblr.com/console
client.info() # get information about the authenticating user
client.dashboard() # get the dashboard for the authenticating user
client.likes() # get the likes for the authenticating user
client.following() # get the blogs followed by the authenticating user

client.follow('codingjester.tumblr.com') # follow a blog
client.unfollow('codingjester.tumblr.com') # unfollow a blog

client.like(id, reblogkey) # like a post
client.unlike(id, reblogkey) # unlike a post
client.blog_info(blogName) # get information about a blog
client.posts(blogName, **params) # get posts for a blog
client.avatar(blogName) # get the avatar for a blog
client.blog_likes(blogName) # get the likes on a blog
client.followers(blogName) # get the followers of a blog
client.blog_following(blogName) # get the publicly exposed blogs that [blogName] follows
client.queue(blogName) # get the queue for a given blog
client.submission(blogName) # get the submissions for a given blog
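As a quick illustration of the **params seen above, here's a minimal sketch; the query options used (type, limit, offset) are standard Tumblr API post-query parameters:

# A hedged sketch: fetch 5 photo posts, skipping the first 10.
# type, limit, and offset are standard Tumblr API query options.
posts = client.posts(blogName, type='photo', limit=5, offset=10)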
PyTumblr lets you create all of the post types that Tumblr supports. When using these types, there are a few default options that can be used with any post type.
The default options are described below.
* state - a string, the state of the post: published, queue, draft, or private
* tags - a list of the tags you want applied to the post
* tweet - a string to customize the tweet text
* date - a string of the customized post date
* format - a string, the format of the post (html or markdown)
* slug - a string, the url slug for this post
We'll show examples of these default options throughout, while showcasing all the specific post types.
Creating a photo post
Creating a photo post supports a bunch of different options plus the described default options:
* caption - a string, the user supplied caption
* link - a string, the "click-through" url for the photo
* source - a string, the url for the photo you want to use (use this or the data parameter)
* data - a list or string, a list of filepaths or a single file path for multipart file upload
#Creates a photo post using a source URL
client.create_photo(blogName, state="published", tags=["testing", "ok"],
                    source="https://68.media.tumblr.com/b965fbb2e501610a29d80ffb6fb3e1ad/tumblr_n55vdeTse11rn1906o1_500.jpg")

#Creates a photo post using a local filepath
client.create_photo(blogName, state="queue", tags=["testing", "ok"],
                    tweet="Woah this is an incredible sweet post [URL]",
                    data="/Users/johnb/path/to/my/image.jpg")

#Creates a photoset post using several local filepaths
client.create_photo(blogName, state="draft", tags=["jb is cool"], format="markdown",
                    data=["/Users/johnb/path/to/my/image.jpg", "/Users/johnb/Pictures/kittens.jpg"],
                    caption="## Mega sweet kittens")
Creating a text post
Creating a text post supports the same options as default and just two other parameters:
* title - a string, the optional title for the post. Supports markdown or html
* body - a string, the body of the post. Supports markdown or html
#Creating a text post
client.create_text(blogName, state="published", slug="testing-text-posts", title="Testing", body="testing1 2 3 4")
Creating a quote post
Creating a quote post supports the same options as default and two other parameters:
* quote - a string, the full text of the quote. Supports markdown or html
* source - a string, the cited source. HTML supported
#Creating a quote post
client.create_quote(blogName, state="queue", quote="I am the Walrus", source="Ringo")
Creating a link post
Creating a link post supports the same options as default and three other parameters (as used in the example below):
* title - a string, the title of the post. Supports markdown or html
* url - a string, the url you want to create a link post for
* description - a string, the description of the url you are posting. Supports markdown or html
#Create a link post
client.create_link(blogName, title="I like to search things, you should too.", url="https://duckduckgo.com",
                   description="Search is pretty cool when a duck does it.")
Creating a chat post
Creating a chat post supports the same options as default and two other parameters:
* title - a string, the title of the chat post
* conversation - a string, the text of the conversation/chat, with dialog labels (no html)
#Create a chat post
chat = """John: Testing can be fun!
Renee: Testing is tedious and so are you.
John: Aw.
"""

client.create_chat(blogName, title="Renee just doesn't understand.", conversation=chat, tags=["renee", "testing"])
Creating an audio post
Creating an audio post allows for all default options and has three other parameters. The only thing to keep in mind while dealing with audio posts is that you must use either the external_url parameter or data; you cannot use both at the same time.
* caption - a string, the caption for your post
* external_url - a string, the url of the site that hosts the audio file
* data - a string, the filepath of the audio file you want to upload to Tumblr
#Creating an audio file
client.create_audio(blogName, caption="Rock out.", data="/Users/johnb/Music/my/new/sweet/album.mp3")

#lets use soundcloud!
client.create_audio(blogName, caption="Mega rock out.", external_url="https://soundcloud.com/skrillex/sets/recess")
Creating a video post
Creating a video post allows for all default options and has three other options. Like the other post types, it has some restrictions: you cannot use the embed and data parameters at the same time.
* caption - a string, the caption for your post
* embed - a string, the HTML embed code for the video
* data - a string, the path of the file you want to upload
#Creating an upload from YouTube
client.create_video(blogName, caption="Jon Snow. Mega ridiculous sword.",
                    embed="http://www.youtube.com/watch?v=40pUYLacrj4")

#Creating a video post from local file
client.create_video(blogName, caption="testing", data="/Users/johnb/testing/ok/blah.mov")
Editing a post
Updating a post requires knowing what type of post you're updating. You can supply any of the options given above when updating.
client.edit_post(blogName, id=post_id, type="text", title="Updated")
client.edit_post(blogName, id=post_id, type="photo", data="/Users/johnb/mega/awesome.jpg")
Reblogging a Post
Reblogging a post just requires knowing the post id and the reblog key, which is supplied in the JSON of any post object.
client.reblog(blogName, id=125356, reblog_key="reblog_key")
Deleting a post
Deleting a post just requires that you own the post and have the post id.
client.delete_post(blogName, 123456) # Deletes your post :(
A note on tags: when passing tags as params, please pass them as a list (not a comma-separated string):
client.create_text(blogName, tags=['hello', 'world'], ...)
Getting notes for a post
In order to get the notes for a post, you need to have the post id and the blog that it is on.
data = client.notes(blogName, id='123456')
The results include a timestamp you can use to make future calls.
data = client.notes(blogName, id='123456', before_timestamp=data["_links"]["next"]["query_params"]["before_timestamp"])
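The before_timestamp parameter makes it possible to page through all of a post's notes. Here's a minimal sketch, assuming each response keeps exposing _links.next.query_params.before_timestamp until the notes are exhausted:

# A sketch: collect every note for a post by following the pagination links.
# Assumes the response stops including _links.next once there are no more pages.
all_notes = []
data = client.notes(blogName, id='123456')
while True:
    all_notes.extend(data.get('notes', []))
    next_page = data.get('_links', {}).get('next')
    if next_page is None:
        break
    data = client.notes(blogName, id='123456',
                        before_timestamp=next_page['query_params']['before_timestamp'])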
# get posts with a given tag
client.tagged(tag, **params)
This client comes with a nice interactive console to run you through the OAuth process and grab your tokens (storing them for future use).
You'll need pyyaml installed to run it, but then it's just:
$ python interactive-console.py
and away you go! Tokens are stored in ~/.tumblr and are also shared by other Tumblr API clients like the Ruby client.
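Since the tokens live in ~/.tumblr as YAML, you can reuse them in your own scripts. A minimal sketch, assuming the file contains the keys consumer_key, consumer_secret, oauth_token, and oauth_token_secret (the names written by the interactive console):

import os
import yaml
import pytumblr

# Load the tokens the interactive console saved to ~/.tumblr (YAML).
# The key names below are assumptions; check your own ~/.tumblr file.
with open(os.path.expanduser('~/.tumblr')) as f:
    tokens = yaml.safe_load(f)

client = pytumblr.TumblrRestClient(
    tokens['consumer_key'],
    tokens['consumer_secret'],
    tokens['oauth_token'],
    tokens['oauth_token_secret'],
)

print(client.info())  # verify the stored credentials still work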
The tests (and coverage reports) are run with nose, like this:
$ python setup.py test
If you accumulate data on which you base your decision-making as an organization, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining customer-centric to the greatest extent possible, and streamlining processes to get on-the-button outcomes can all be traced back to an organization’s capacity to build a future-ready data architecture.
In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points form our checklist for what we perceive to be an anticipatory analytics ecosystem.
#big data #data science #big data analytics #data analysis #data architecture #data transformation #data platform #data strategy #cloud data platform #data acquisition
The opportunities big data offers also come with very real challenges that many organizations are facing today. Often, it’s finding the most cost-effective, scalable way to store and process boundless volumes of data in multiple formats that come from a growing number of sources. Then organizations need the analytical capabilities and flexibility to turn this data into insights that can meet their specific business objectives.
This Refcard dives into how a data lake helps tackle these challenges at both ends — from its enhanced architecture that’s designed for efficient data ingestion, storage, and management to its advanced analytics functionality and performance flexibility. You’ll also explore key benefits and common use cases.
As technology continues to evolve with new data sources, such as IoT sensors and social media churning out large volumes of data, there has never been a better time to discuss the possibilities and challenges of managing such data for varying analytical insights. In this Refcard, we dig deep into how data lakes solve the problem of storing and processing enormous amounts of data. While doing so, we also explore the benefits of data lakes, their use cases, and how they differ from data warehouses (DWHs).
#big data #data analytics #data analysis #business analytics #data warehouse #data storage #data lake #data lake architecture #data lake governance #data lake management
Using data to inform decisions is essential to product management, or anything really. And thankfully, we aren’t short of it: any online application generates an abundance of data, and it’s up to us to collect it and make sense of it.
Google Data Studio helps us understand the meaning behind data, enabling us to build beautiful visualizations and dashboards that transform data into stories. If it isn’t already, data literacy is becoming as fundamental a skill as learning to read or write.
Nothing is more powerful than data democracy, where anyone in your organization can regularly make data-informed decisions. To enable this, we need to visualize data in a way that brings it to life and makes it more accessible. I’ve recently been learning how to do this and wanted to share some of the cool ways Google Data Studio makes it possible.
#google-data-studio #blending-data #dashboard #data-visualization #creating-visualizations #how-to-visualize-data #data-analysis #data-visualisation
The COVID-19 pandemic disrupted supply chains and brought economies around the world to a standstill. As a result, businesses need access to accurate, timely data more than ever before, and the demand for data analytics is skyrocketing as they try to navigate an uncertain future. However, the sudden surge in demand comes with its own set of challenges.
Here is how the COVID-19 pandemic is affecting the data industry and how enterprises can prepare for the data challenges to come in 2021 and beyond.
#big data #data #data analysis #data security #data integration #etl #data warehouse #data breach #elt