Ian Robinson

1629587040

Reasons Data Scientists Must be Data Engineers in 2021

They still have an opportunity to make an amazing comeback with the help of data science. Yes, it is necessary for engineers to learn data science in 2021, in order to keep their place in the job market. Data science is a blend of mathematics, machine learning, business decision tools, and algorithms. It helps businesses bring out knowledge and insight from structured and unstructured data.

With data becoming the center of decision-making in almost every industry, the demand for data science professionals has surged in the recent past. Engineers, on the other hand, are highly skilled professionals who need a switch. Most engineers are looking for ways to shift from their engineering jobs to data science or the big data industry to stay ahead in the job market.

#big-data

Migrating From Jira Server: Guide, Pros, And Cons

February 15, 2022 marked a significant milestone in Atlassian’s Server EOL (End Of Life) roadmap. This was not the final step. We still have two major milestones ahead of us: end of new app sales in Feb 2023, and end of support in Feb 2024. In simpler words, businesses still have enough time to migrate their Jira Server to one of the two available products – Atlassian Cloud or Atlassian DC. But the clock is ticking. 

Jira Cloud VS Data Center

If we were to go by Atlassian's numbers, more than 90% of their new customers choose cloud first.

“About 80% of Fortune 500 companies have an Atlassian Cloud license. More than 90% of new customers choose cloud first.” – Daniel Scott, Product Marketing Director, Tempo

So that’s settled, right? We are migrating from Server to Cloud? And what about the solution fewer people talk about yet many users rely on – Jira DC? 

Both are viable options and your choice will depend greatly on the needs of your business, your available resources, and operational processes. 

Let’s start by taking a look at the functionality offered by Atlassian Cloud and Atlassian DC.

| Feature | Atlassian Cloud | Atlassian Data Center |
| --- | --- | --- |
| Product plans | Multiple plans | One plan |
| Billing | Monthly and annual | Annual only |
| Pricing model | Per user or tiered | Tiered only |
| Support | Varying support levels depending on your plan: Enterprise support coverage is equivalent to Atlassian’s Data Center Premier Support offering | Varying support levels depending on the package: Priority Support or Premier Support (purchased separately) |
| Total Cost of Ownership | TCO includes your subscription fee, plus product administration time | TCO includes your subscription fee and product administration time, plus: costs related to infrastructure provisioning or IaaS fees (for example, AWS costs); planned downtime; time and resources needed for software upgrades |
| Data encryption services | ✅ | ❌ |
| Data residency services | ✅ | ❌ |
| Audit logging | Organization-level audit logging available via Atlassian Access (Jira Software, Confluence); product-level audit logs (Jira Software, Confluence) | Advanced audit logging |
| Device security | Mobile device management support (Jira Software, Confluence, Jira Service Management); mobile application management (currently on the roadmap) | Mobile device management support (Jira Software, Confluence, Jira Service Management) |
| Content security | ✅ | ❌ |
| Data storage limits | 2 GB (Free); 250 GB (Standard); unlimited storage (Premium and Enterprise) | No limits |
| Performance | Continuous performance updates to improve load times, search responsiveness, and attachments; cloud infrastructure hosted in six geographic regions to reduce latency | Rate limiting; CDN supports Smart mirrors and mirror farms (Bitbucket) |
| Backup and data disaster recovery | Jira leverages multiple geographically diverse data centers, has a comprehensive backup program, and gains assurance by regularly testing their disaster recovery and business continuity plans; backups are generated daily and retained for 30 days to allow for point-in-time data restoration | ❌ |
| Containerization and orchestration | ✅ | Docker images; Kubernetes support (on the roadmap for now) |
| Change management and upgrades | Atlassian automatically handles software and security upgrades for you; sandbox instance to test changes (Premium and Enterprise); release track options for Premium and Enterprise (Jira Software, Jira Service Management, Confluence) | ❌ |
| Direct access to the database | No direct access to change the database structure, file system, or other server infrastructure; extensive REST APIs for programmatic data access | Direct database access |
| Insights and reporting | Organization and admin insights to track adoption of Atlassian products, and evaluate the security of your organization | Data Pipeline for advanced insights; Confluence analytics |

Pros and cons of Jira Cloud

When talking about pros and cons, there’s always a chance that a competitive advantage for some is a dealbreaker for others. That’s why I decided to talk about pros and cons in matching pairs. 

Pro: Scalability is one of the primary reasons businesses are choosing Jira Cloud. DC is technically also scalable, but you’ll need to scale on your own whereas the cloud version allows for the infrastructure to scale with your business. 

Con: Despite the cloud’s ability to grow with your business, there is still a user limit of 35k users. In addition to that, the costs will grow alongside your needs. New users, licenses, storage, and computing power – all come at an additional cost. So, when your organization reaches a certain size, migrating to Jira DC becomes more cost-efficient.

Pro: Jira takes care of maintenance and support for you.

Con: Your business can suffer from unpredicted downtime. And there are certain security risks.  

Pro: Extra bells and whistles: 

  • Sandbox: Sandbox is a safe environment system admins can use to test applications and integrations before rolling them out to the production environment. 
  • Release tracks: Admins can be more flexible with their product releases as they can access batch and control cloud releases. This means they’ll have much more time to test existing configurations and workflows against a new update. 
  • Insight Discovery: More data means more ways you can impact your business or product in a positive, meaningful way. 
  • Team Calendars: This is a handy feature for synchronization and synergy across teams. 

Con: Most of these features are locked behind a paywall. They are available only on Premium and Enterprise licenses, or on Enterprise alone, either fully or through added functionality. For example, Release tracks are only available to Enterprise customers. In addition, the costs will grow as you scale the offering to fit your growing needs.

Pros and cons of Jira Data Center

I’ll be taking the same approach to talking about the pros and cons as I did when writing about Atlassian Cloud. Pros and cons are paired. 

Pro: Hosting your own system means you can scale horizontally and vertically through additional hardware. Extension of your systems is seamless, and there is no downtime (if you do everything correctly). Lastly, you don’t have to worry about the user limit – there is none. 

Con: While having more control over your systems is great, it implies a dedicated staff of engineers, additional expenses on software licensing, hardware, and physical space. Moreover, seamless extension and 0% downtime are entirely on you.

Pro: Atlassian has updated the DC offering with native bundled applications such as Advanced Roadmaps, Team Calendars and Analytics for Confluence, Insight Asset Management, and Insight Discovery in Jira Service Management DC.

Con: Atlassian has updated their pricing to reflect these changes. And you are still getting fewer “bells and whistles” than Jira Cloud users (as we can see from the feature comparison). 

Pro: You are technically safer as the system is supported on your hardware by your specialists. Any and all Jira server issues, poor updates, and downtime are simply not your concern.
 

Con: Atlassian offers excellent security options in the cloud, from data encryption in transit and at rest, to mobile app management, to audit offerings and API token controls. In their absence, your company has to dedicate additional resources to security.

Pro: Additional benefits from Atlassian, such as the Priority Support bundle (all DC subscriptions have this option) and the Data Center loyalty discount (more on that in the pricing section).

The Pricing

Talking about the pricing of SaaS products is always a challenge, as there are always multiple tiers and various pay-as-you-go features. Barebones Jira Cloud, for instance, is completely free of charge, yet it comes with a series of serious limitations.

Standard Jira Cloud will cost you an average of $7.50 per user per month while premium cranks that price up to $14.50. The Enterprise plan is billed annually and the cost is determined on a case-by-case basis. You can see the full comparison of Jira Cloud plans here. And you can use this online calculator to learn the cost of ownership in your particular case.

| Users | Product | Standard (monthly / annually) | Premium (monthly / annually) |
| --- | --- | --- | --- |
| 50 | Jira Software | $387.50 / $3,900 | $762.50 / $7,650 |
| 50 | Jira Work Management | $250 / $2,500 | ❌ |
| 50 | Jira Service Management | $866.25 / $8,650 | $2,138.25 / $21,500 |
| 50 | Confluence | $287.50 / $2,900 | $550 / $5,500 |
| 100 | Jira Software | $775 / $7,750 | $1,525 / $15,250 |
| 100 | Jira Work Management | $500 / $5,000 | ❌ |
| 100 | Jira Service Management | $1,653.75 / $16,550 | $4,185.75 / $42,000 |
| 100 | Confluence | $575 / $5,750 | $1,100 / $11,000 |
| 500 | Jira Software | $3,140 / $31,500 | $5,107.50 / $51,000 |
| 500 | Jira Work Management | $1,850 / $18,500 | ❌ |
| 500 | Jira Service Management | $4,541.25 / $45,400 | $11,693.25 / $117,000 |
| 500 | Confluence | $2,060 / $20,500 | $3,780 / $37,800 |

Please note that these prices were calculated without any apps included. 
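
To make the Cloud figures easier to compare, the per-user monthly rate and the saving from annual billing can be derived from the table. The following is a minimal Python sketch that uses only the Jira Software prices quoted above; the dictionary layout and the function names are illustrative assumptions, not part of any Atlassian tool.

```python
# Jira Software prices copied from the table above (USD, no apps included).
# Layout: users -> plan -> (monthly bill, annual bill)
JIRA_SOFTWARE = {
    50: {"standard": (387.50, 3900), "premium": (762.50, 7650)},
    100: {"standard": (775.00, 7750), "premium": (1525.00, 15250)},
    500: {"standard": (3140.00, 31500), "premium": (5107.50, 51000)},
}


def per_user_monthly(users, plan):
    """Monthly bill divided by the number of users in that row."""
    monthly, _annual = JIRA_SOFTWARE[users][plan]
    return monthly / users


def annual_saving(users, plan):
    """Difference between twelve monthly payments and one annual payment."""
    monthly, annual = JIRA_SOFTWARE[users][plan]
    return monthly * 12 - annual


for users in JIRA_SOFTWARE:
    for plan in ("standard", "premium"):
        rate = per_user_monthly(users, plan)
        saving = annual_saving(users, plan)
        print(f"{users} users, {plan}: ${rate:.2f} per user/month, "
              f"${saving:,.2f} saved by paying annually")
```

For the 50-user Standard row this works out to $7.75 per user per month, slightly above the average figure quoted earlier, and the per-user rate drops at the 500-user tier.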

Jira Data Center starts at $42,000 per year and the plan includes up to 500 users. If you are a new client and are not eligible for any discounts*, here’s a chart that should give you an idea as to the cost of ownership of Jira DC. You can find more information regarding your specific case here.

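| Product | Users | Commercial Annual Plan | Academic Annual Plan |
| --- | --- | --- | --- |
| Jira Data Center | 1-500 | USD 42,000 | USD 21,000 |
| Jira Data Center | 501-1000 | USD 72,000 | USD 36,000 |
| Jira Data Center | 1001-2000 | USD 120,000 | USD 60,000 |
| Confluence for Data Center | 1-500 | USD 27,000 | USD 13,500 |
| Confluence for Data Center | 501-1000 | USD 48,000 | USD 24,000 |
| Confluence for Data Center | 1001-2000 | USD 84,000 | USD 42,000 |
| Bitbucket for Data Center | 1-25 | USD 2,300 | USD 1,150 |
| Bitbucket for Data Center | 26-50 | USD 4,200 | USD 2,100 |
| Bitbucket for Data Center | 51-100 | USD 7,600 | USD 3,800 |
| Jira Service Management for Data Center | 1-50 | USD 17,200 | USD 8,600 |
| Jira Service Management for Data Center | 51-100 | USD 28,600 | USD 14,300 |
| Jira Service Management for Data Center | 101-250 | USD 51,500 | USD 25,750 |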
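
Because Data Center pricing is tiered, the bill depends on which user bracket you fall into rather than on the exact head count. Here is a small Python sketch of that lookup for the Jira Data Center rows above. The function name is hypothetical, and the academic price is computed as half of the commercial price only because that is the pattern visible in the chart, not because Atlassian documents it as a rule here.

```python
from bisect import bisect_left

# Jira Data Center commercial tiers from the chart above:
# (upper user limit of the tier, USD per year).
JIRA_DC_TIERS = [(500, 42_000), (1000, 72_000), (2000, 120_000)]


def jira_dc_annual_price(users, academic=False):
    """Return the annual price of the smallest tier that fits `users`."""
    if users < 1:
        raise ValueError("users must be at least 1")
    limits = [limit for limit, _price in JIRA_DC_TIERS]
    index = bisect_left(limits, users)  # first tier whose limit >= users
    if index == len(JIRA_DC_TIERS):
        raise ValueError("user count is beyond the tiers listed in the chart")
    price = JIRA_DC_TIERS[index][1]
    # Assumption: academic plans in the chart cost half the commercial price.
    return price // 2 if academic else price


for head_count in (120, 500, 501, 1800):
    print(head_count, jira_dc_annual_price(head_count))
```

A team of 501 users already pays the 501-1000 tier price, so it is worth checking how close you are to a tier boundary before buying.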

*Discounts:

  • Centralized per-user licensing allows users to access all enterprise instances with a single Enterprise license.
  • There’s an option for dual licensing for users who purchase an annual cloud subscription with 1,001 or more users. In this case, Atlassian extends your existing server maintenance or Data Center subscription for up to one year at a 100% discount.
  • There are certain discounts for apps depending on your partnership level.
  • Depending on your situation, you may qualify for several Jira Data Center discount programs:

What should be your User Migration strategy?

Originally, there were several migration methods: Jira Cloud Migration Assistant, Jira Cloud Site Import, and an option to migrate via CSV export (though Jira actively discourages you from using this method). However, Jira’s team has focused its efforts on improving the Migration Assistant and has chosen to discontinue Cloud Site Import support.

Thanks to the broadened functionality of the assistant, it is now the only go-to method for migration with just one exception. If you are migrating over 1000 users and you absolutely need to migrate advanced roadmaps – you’ll need to rely on Site Import. At least for now, as Jira is actively working on implementing this feature in their assistant.

Here’s a quick comparison of the options and their limitations.

| Method | Features | Limitations |
| --- | --- | --- |
| Cloud Migration Assistant | App migration; existing data on a Cloud site is not overwritten; you choose the projects, users, and groups you want to migrate; Jira Service Management customer account migration; better UI to guide you through the migration; potential migration errors are displayed in advance; migration can be done in phases, reducing the downtime; pre- and post-migration reports | You must be on a supported self-managed version of Jira |
| Site Import | Can migrate Advanced Roadmaps | App data is not migrated; migration overrides existing data on the Cloud site; separate user import; users from external directories are not migrated; no choice of data you want or don’t want migrated; there’s a need to split attachments into up to 5GB chunks; higher risks of downtime due to the “all or nothing” approach; you must be on a supported self-managed version of Jira |
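
The 5GB attachment limit in the table means a large attachment store has to be grouped into batches before it is uploaded. The sketch below shows one way to plan those batches in Python, using a simple first-fit-decreasing grouping. It is only an illustration: the function name and the sample file names are hypothetical, and how each batch is then packaged and imported is not covered here.

```python
from typing import Dict, List

GB = 1024 ** 3
MAX_CHUNK_BYTES = 5 * GB  # chunk limit quoted in the table above


def split_into_chunks(sizes: Dict[str, int],
                      limit: int = MAX_CHUNK_BYTES) -> List[List[str]]:
    """Group files so that the total size of each chunk stays within `limit`.

    Files are placed largest first into the first chunk with enough room.
    """
    chunks: List[List[str]] = []
    free_space: List[int] = []  # bytes still available in each chunk
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        if size > limit:
            raise ValueError(f"{name} alone exceeds the chunk limit")
        for i, free in enumerate(free_space):
            if size <= free:
                chunks[i].append(name)
                free_space[i] -= size
                break
        else:
            # No existing chunk has room, so start a new one.
            chunks.append([name])
            free_space.append(limit - size)
    return chunks


# Hypothetical attachment folders and their sizes in bytes.
example = {"a": 3 * GB, "b": 3 * GB, "c": 2 * GB, "d": 1 * GB}
plan = split_into_chunks(example)
print(plan)
```

First-fit-decreasing does not always produce the minimum number of chunks, but it is easy to reason about and usually close enough for planning an upload.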

Pro tip: If you have a large user base (above 2000), migrate the users before you migrate projects and spaces. This way, you will not disrupt the workflow, as users are still working on Server, and the later migration of data will take less time.

How to migrate to Jira Cloud

Now that we have settled on one particular offering based on available pricing models as well as the pros and the cons that matter the most to your organization, let’s talk about the “how”. 

How does one migrate from Jira Server to Jira Cloud?

Pre-migration checklist

Jira’s Cloud Migration Assistant is a handy tool. It will automatically review your data for common errors. But it is incapable of doing all of the work for you. That’s why we – and Atlassian for that matter – recommend creating a pre-migration checklist.   

Smart Checklist will help you craft an actionable, context-rich checklist directly inside a Jira ticket. This way, none of the tasks will be missed, lost, or abandoned. 

Below is an example of what your migration checklist will look like in Jira.

Feel free to copy the code and paste it into your Smart Checklist editor and you’ll have the checklist at the ready. 

# Create a user migration plan #must
> Please keep in mind that Jira Cloud Migration Assistant migrates all users and groups as well as users and groups related to selected projects
- Sync your user base
- Verify synchronization
- External users sync verification
- Active external directory verification
## Check your Jira Server version #must
- Verify via user interface or Support Zip Product Version Verification
> Jira Migration Assistant will not work unless Jira is running on a supported version
## Fix any duplicate email addresses #must
- Verify using SQL
> Duplicate email addresses are not supported by Jira Cloud and therefore can't be migrated with the Jira Cloud Migration Assistant. To avoid errors, you should find and fix any duplicate email addresses before migration. If user information is managed in an LDAP Server, you will need to update emails there and sync with Jira before the migration. If user information is managed locally, you can fix them through the Jira Server or Data Center user interface.
## Make sure you have the necessary permissions #must
- System Admin global permissions on the Server instance
- Exists in the target Cloud site
- Site Administrator Permission in the cloud
## Check for conflicts with group names #must
- Make sure that the groups in your Cloud Site don't have the same names as groups in Server
> Unless you are actively trying to merge them
- Delete or update add-on users so as not to cause migration issues
- Verify via SQL
## Update firewall allowance rules #must
- None of the domains should be blocked by firewall or proxy
## Find a way to migrate apps #must
- Contact app vendors
## Check public access settings #must
- Projects
- Filters
- Boards
- Dashboards
## Review server setup #must
- at least 4gb Heap Allocation
- Open Files limit review
- Verify via support zip
## Check Server timezone #must for merging Cloud sites
- Switch to UTC if using any other timezone
> Add a system flag to the Jira Server instance -Duser.timezone=UTC as outlined in this article about updating documentation to include timezone details.
## Fix any duplicate shared configuration
## Storage limits
## Prepare the server instance
- Check data status
- All fields have value and are not null
- Any archived projects you wish to migrate are activated
## Prepare your cloud site
- Same Jira products enabled
- Same language
- User migration strategy
## Data backup
- Backup Jira Server site
- Backup Cloud site
## Run a test migration
- Done
## Notify Jira support
- Get in touch with Jira migration support
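
Two of the checklist items above, duplicate email addresses and conflicting group names, lend themselves to a quick scripted check. The Python sketch below runs against an in-memory SQLite database so that it is self-contained. The `cwd_user` table and `lower_email_address` column are assumed to match the user table in your Jira database, so verify the names against your own schema and adapt the SQL dialect before running anything against a real instance. The helper names and the sample data are hypothetical.

```python
import sqlite3

# Assumption: users live in a table called cwd_user with a lower-cased
# email column. Check this against your actual Jira database schema.
DUPLICATE_EMAILS_SQL = """
SELECT lower_email_address, COUNT(*) AS occurrences
FROM cwd_user
GROUP BY lower_email_address
HAVING COUNT(*) > 1
ORDER BY lower_email_address
"""


def find_duplicate_emails(conn):
    """Return (email, count) rows for every email used by more than one user."""
    return conn.execute(DUPLICATE_EMAILS_SQL).fetchall()


def conflicting_groups(server_groups, cloud_groups):
    """Return Server group names that also exist on the Cloud site.

    Names are compared case-insensitively as a cautious assumption.
    """
    cloud = {name.lower() for name in cloud_groups}
    return sorted(name for name in set(server_groups) if name.lower() in cloud)


# Stand-in data so the sketch runs without a real Jira database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cwd_user (user_name TEXT, lower_email_address TEXT)")
conn.executemany(
    "INSERT INTO cwd_user VALUES (?, ?)",
    [
        ("alice", "alice@example.com"),
        ("alice.admin", "alice@example.com"),
        ("bob", "bob@example.com"),
    ],
)

duplicates = find_duplicate_emails(conn)
conflicts = conflicting_groups(
    ["jira-administrators", "developers"],
    ["Developers", "site-admins"],
)
print(duplicates)
print(conflicts)
```

Any row returned by the first check needs fixing before migration, while a group reported by the second check is only a problem if you are not deliberately merging it.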

Use backups

On the one hand, having all of your Jira products on a server may seem like a backup in and of itself. On the other hand, there are data migration best practices we should follow even if it’s just a precaution. No one has ever felt sorry for their data being too safe. 

In addition, there are certain types of migration errors that can be resolved much faster with having a backup at hand. 

  1. Jira Server Database backup: this step creates a DB backup in an XML format.
    1. Log in with Jira System Admin permissions
    2. Go to System -> Import and Export -> Backup Manager -> Backup for server.
    3. Click the Create backup for server button.
    4. Type in the name for your backup.
    5. Jira will create a zipped XML file and notify you once the backup is ready.

  2. Jira Cloud Backup: This backup also saves your data in an XML format. The process is quite similar to creating a Jira Server backup with the only difference taking place on the Backups page.
    1. Select the option to save your attachments, logos, and avatars.
    2. Click on the Create backup button.

  3. As you can see, the Cloud backup includes the option to save attachments, avatars, and logos. This step should be done manually when backing up Server data.
    1. Create a Zip archive for this data
    2. Make sure it follows the structure suggested by Atlassian
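
The manual archive step can be scripted. The following Python sketch zips a prepared folder while preserving its relative layout, so whatever structure you arrange beforehand is what ends up in the archive. The folder layout itself is not specified here: the `attachments` folder and file names in the demo are placeholders, and the structure Atlassian expects should be taken from their documentation.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_directory(source_dir, archive_path):
    """Zip every file under `source_dir`, keeping paths relative to it."""
    source = Path(source_dir)
    archive = Path(archive_path)
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source.rglob("*")):
            # Skip directories, and skip the archive itself if it lives
            # inside the folder being zipped.
            if path.is_file() and path.resolve() != archive.resolve():
                zf.write(path, arcname=path.relative_to(source).as_posix())
    return archive


# Demo with a throwaway folder; the layout below is purely illustrative.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "export"
    (data / "attachments").mkdir(parents=True)
    (data / "attachments" / "example.txt").write_text("hello")
    archive = zip_directory(data, Path(tmp) / "export.zip")
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()

print(names)
```

Listing the archive contents afterwards, as the demo does, is a cheap way to confirm the layout before you upload anything.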

Migrating your Jira instance to the cloud via the Jira Migration Assistant

Jira Cloud Migration Assistant is a free add-on Atlassian recommends using when migrating to the cloud. It accesses and evaluates your apps and helps migrate multiple projects. 

Overall, the migration assistant offers a more stable and reliable migration experience. It automatically checks for certain errors. It makes sure all users have unique and valid emails, and makes sure that none of the project names and keys conflict with one another. 

This is a step-by-step guide for importing your Jira Server data backup file into Jira Cloud.

  1. Log into Jira Cloud with admin permissions
  2. Go to System -> Import and Export -> External System Import
  3. Click on the Jira Server import option
  4. Select the backup Zip you have created
  5. Jira will check the file for errors and present you with two options: enable or disable outgoing mail. Don’t worry, you will be able to change this setting after the migration process is complete.
  6. Then you will be presented with an option to merge Jira Server and Jira Cloud users
    1. Choosing overwrite will replace the users with users from the imported files
    2. The merge option will merge groups with the same name
    3. Lastly, you can select the third option if you are migrating users via Jira’s assistant
  7. Run the import

How do you migrate Jira Server into Jira DC?

Before we can proceed with the migration process, please make sure you meet the following prerequisites:

  1. Make sure you are installing Jira on one of the supported platforms. Atlassian has a list of supported platforms for Jira 9.1.
  2. Make sure the applications you are using are compatible with Jira DC. You will be required to switch to Data Center-compatible versions of your applications, provided such versions are available.
  3. Make sure you meet the necessary software and hardware requirements:
    1. You have a DC license
    2. You are using a supported database, OS, and Java version
    3. You are using OAuth authentication if your application links to other Atlassian products

Once you are certain you are ready to migrate your Jira Server to Jira Data Center, you can proceed with an installation that’s much simpler than one would expect.

  1. Upgrade your apps to be compatible with Jira DC
  2. Go to Administration -> Applications -> Versions and licenses
  3. Enter your Jira DC License Key
  4. Restart Jira

That’s it. You are all set. Well, unless your organization has specific needs such as continuous uptime, performance under heavy loads, and scalability, in which case you will need to set up a server cluster. You can find out more about setting up server clusters in this guide.  

iOS App Dev

1624072920

10 Must-have Skills for Data Engineering Jobs

Big data skills are crucial to landing data engineering roles. From designing, creating, building, and maintaining data pipelines to collating raw data from various sources and ensuring performance optimization, data engineering professionals carry out a wide range of tasks. They are expected to know about big data frameworks, databases, building data infrastructure, containers, and more. It is also important that they have hands-on exposure to tools such as Scala, Hadoop, HPCC, Storm, Cloudera, Rapidminer, SPSS, SAS, Excel, R, Python, Docker, Kubernetes, MapReduce, and Pig, to name a few.

Here, we list some of the important skills that one should possess to build a successful career in big data.

1. Database Tools
2. Data Transformation Tools
3. Data Ingestion Tools
4. Data Mining Tools

#big data #latest news #data engineering jobs #skills for data engineering jobs #10 must-have skills for data engineering jobs #data engineering

iOS App Dev

1622608380

Data Engineer, Data Scientists & Other Data Careers, Explained.

This week, take part in our survey and let us know where you recently applied Data Science, Analytics and Machine Learning. Also: Data Scientist, Data Engineer & Other Data Careers, Explained; A Guide On How To Become A Data Scientist (Step By Step Approach); A checklist to track your Data Science progress; How to Determine if Your Machine Learning Model is Overtrained; Differentiable Programming from Scratch; and much, much more.

Our new KDnuggets Top Blogs Reward Program will pay the authors of top blogs - check details here. Reposts are accepted, but we love original submissions, which are rewarded at 3 times the rate of reposts.

#kdnuggets 2021 issues #analytics #careers #data engineer #data engineering #data science #data scientist #poll #survey

iOS App Dev

1620466520

Your Data Architecture: Simple Best Practices for Your Data Strategy

If you accumulate data on which you base your decision-making as an organization, you should probably think about your data architecture and possible best practices.

If you accumulate data on which you base your decision-making as an organization, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining customer-centric to the greatest extent possible, and streamlining processes to get on-the-button outcomes can all be traced back to an organization’s capacity to build a future-ready data architecture.

In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points form our checklist for what we perceive to be an anticipatory analytics ecosystem.

#big data #data science #big data analytics #data analysis #data architecture #data transformation #data platform #data strategy #cloud data platform #data acquisition

iOS App Dev

1622564160

Why You Should Consider Being a Data Engineer Instead of a Data Scientist

I just want to say that whether you choose data science or data engineering should ultimately depend on your interests and where your passion lies. However, if you’re sitting on the fence, unsure of which to choose because they are of equal interest, then keep reading!

Data science has been a hot topic for a while, but a new king of the jungle has arrived — data engineers. In this article, I’m going to share with you several reasons why you might want to consider pursuing data engineering over data science.

Note that this IS an opinionated article, so take what you want from it. That being said, I hope you enjoy!

1. Data engineering is fundamentally more important than data science.

2. The demand for data engineers is growing… by a lot.

3. Data engineering skills are extremely useful as a data scientist.

#2021 apr opinions #career advice #data engineer #data engineering #data science #data scientist