Bongani Ngema


NewsFeed: A Localized News Reader Android App Powered By NewsAPI

About News Feed

News Feed is a news app powered by NewsAPI that shows the latest news, categorized based on the user's location.

  • News is loaded based on the user's locality.
  • Clean and Simple Material UI.
  • Supports dark theme.

The app architecture is MVVM.

Libraries used


Open the project in Android Studio

git clone
  • Create and place your NewsAPI key in .
  • Wait for the project to finish building, and happy coding.


News Feed can be downloaded from our releases section.

Download Details:

Author: KevinGitonga
Source Code: 
License: Apache-2.0 license

#kotlin #news #mvvm #clean #architecture 

Lawrence Lesch


The Simple But Very Powerful & Incredibly Fast State Management


The most straightforward, extensible, and incredibly fast state management library, based on the React state hook.


Hookstate is a modern alternative to Redux, Mobx, Recoil, etc. It is simple to learn, easy to use, extensible, very flexible, and capable of addressing all the state management needs of large, scalable applications. It has impressive performance and predictable behavior.
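Hookstate's core idea, a state handle that can be read, written, and subscribed to from anywhere, can be sketched in plain TypeScript. The snippet below is an illustrative mimic, not Hookstate's actual implementation; see the docs for the real `hookstate`/`useHookstate` API:

```typescript
// Minimal mimic of a global-state handle: just get/set/subscribe,
// which is the core idea behind a hook-based global store.
type Listener<T> = (value: T) => void;

function createState<T>(initial: T) {
  let value = initial;
  const listeners = new Set<Listener<T>>();
  return {
    get: () => value,
    set: (next: T | ((prev: T) => T)) => {
      value = typeof next === "function" ? (next as (prev: T) => T)(value) : next;
      listeners.forEach((l) => l(value)); // notify every subscriber
    },
    subscribe: (l: Listener<T>) => {
      listeners.add(l);
      return () => listeners.delete(l); // returns an unsubscribe handle
    },
  };
}

// Usage: a shared counter observed by a "component"
const counter = createState(0);
const seen: number[] = [];
const unsubscribe = counter.subscribe((v) => seen.push(v));
counter.set((p) => p + 1); // seen: [1]
counter.set(5);            // seen: [1, 5]
unsubscribe();
```

In the real library, `useHookstate` additionally re-renders the subscribing React component when the observed segment of state changes.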

Any questions? Just ask by raising a GitHub ticket.

Why Hookstate

Migrating to version 4

Documentation / Code samples / Demo applications

Demo application

Development tools

Plugins / Extensions

API reference

Hookstate developers workflow

This is a monorepo, which combines the Hookstate core package, extensions, docs, and demo applications. pnpm is used as the node_modules manager and nx as the script launcher. Each package defines its own rules for how to build, test, etc.

From the repository root directory:

npm install -g pnpm - install the pnpm tool

pnpm install - install node_modules for all packages

pnpm nx <script> <package> - run a script for a package, building its dependencies first if required, for example:

  • pnpm nx build core - run the build script for the core package
  • pnpm nx start todolist - run the start script for the todolist package, building all of its dependencies first

Download Details:

Author: Avkonst
Source Code: 
License: MIT license

#typescript #react #plugin #architecture 

Bongani Ngema


Libbra: A Currency Tracker App Demonstration


Libbra is a sample app that allows you to track currency exchanges. This app presents a modern approach to Android application development using Kotlin and the latest tech stack.

This project is a hiring task by Revolut. The goal of the project is to demonstrate best practices, provide a set of guidelines, and present modern Android application architecture that is modular, scalable, maintainable and testable. This application may look simple, but it has all of these small details that will set the rock-solid foundation of the larger app suitable for bigger teams and long application lifecycle management.

Libbra Preview


Environment setup

First off, you need the latest Android Studio 3.6.0 (or newer) to be able to build the app.

Moreover, to sign your app for release, please refer to to find required fields.

# Signing Config

Code style

To maintain the style and quality of the code, the static analysis tools below are used. All of them are properly configured, and you can find their configuration in the project root directory as .{toolName}.

Tool      Config file            Check command            Fix command
detekt    default-detekt-config  ./gradlew detekt         -
ktlint    -                      ./gradlew ktlintCheck    ./gradlew ktlintFormat
spotless  /spotless              ./gradlew spotlessCheck  ./gradlew spotlessApply
lint      /.lint                 ./gradlew lint           -

All these tools are integrated into a pre-commit git hook to ensure that all static analysis and tests pass before you can commit your changes. To skip them for a specific commit, add this option to your git command:

git commit --no-verify

The pre-commit git hooks run exactly the same checks as the GitHub Actions and are defined in this script. This step ensures that all commits comply with the established rules. However, continuous integration will ultimately validate that the changes are correct.


The app supports different screen sizes, and the content has been adapted to fit mobile devices and tablets. To do that, a flexible layout has been created using one or more of the following concepts:

In terms of design, the recommendations of the Android Material Design guide, a comprehensive guide for visual, motion, and interaction design across platforms and devices, have been followed. This grants the project a great user experience (UX) and user interface (UI). For more info about UX best practices, visit the link.

Moreover, support for dark theme has been implemented, with the following benefits:

  • Can reduce power usage by a significant amount (depending on the device’s screen technology).
  • Improves visibility for users with low vision and those who are sensitive to bright light.
  • Makes it easier for anyone to use a device in a low-light environment.


The architecture of the application is based on, applies, and strictly complies with each of the following five points:


Modules are collections of source files and build settings that allow you to divide a project into discrete units of functionality. In this case, apart from dividing by functionality/responsibility, the following dependency exists between them:

The above graph shows the app modularisation:

  • :app depends on :rules.
  • :rules depends on nothing.

App module

The :app module is the application module, which is needed to create the app bundle. It is also responsible for initiating the dependency graph and other project-global libraries, differentiating especially between different app environments.

Rules modules

The :rules module basically contains lint checks for the entire project.

Architecture components

Ideally, ViewModels shouldn’t know anything about Android. This improves testability, leak safety and modularity. ViewModels have different scopes than activities or fragments. While a ViewModel is alive and running, an activity can be in any of its lifecycle states. Activities and fragments can be destroyed and created again while the ViewModel is unaware.

Passing a reference of the View (activity or fragment) to the ViewModel is a serious risk. Let's assume the ViewModel requests data from the network and the data comes back some time later. At that moment, the View reference might be destroyed, or it might be an old activity that is no longer visible, generating a memory leak and, possibly, a crash.

The communication between the different layers follows the above diagram using the reactive paradigm: observing changes on components without the need for callbacks, avoiding leaks and the edge cases related to them.
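The pattern can be sketched in plain Kotlin: the ViewModel never references the view; instead it exposes an observable holder that views attach to and detach from according to their own lifecycle. This is a simplified stand-in for illustration, not the Jetpack LiveData/ViewModel classes themselves:

```kotlin
// Simplified stand-in for LiveData: the ViewModel holds no reference
// to any view; views observe it and can stop observing at any time.
class ObservableValue<T>(initial: T) {
    private val observers = mutableListOf<(T) -> Unit>()
    var value: T = initial
        set(new) {
            field = new
            observers.forEach { it(new) } // notify attached views
        }
    fun observe(observer: (T) -> Unit): () -> Unit {
        observers += observer
        observer(value)                  // emit the current value on subscribe
        return { observers -= observer } // returns an "unsubscribe" handle
    }
}

// Hypothetical ViewModel: it only updates its own state.
class UserViewModel {
    val userName = ObservableValue("loading...")
    fun onDataArrived(name: String) {
        // Data may arrive while no view is attached; nothing leaks.
        userName.value = name
    }
}

fun main() {
    val vm = UserViewModel()
    val rendered = mutableListOf<String>()
    val stop = vm.userName.observe { rendered += it } // "view" attaches
    vm.onDataArrived("Alice")
    stop()                   // "view" destroyed: observer detached
    vm.onDataArrived("Bob")  // no crash, no leak, nothing rendered
    println(rendered)        // [loading..., Alice]
}
```

The design point is the direction of the references: the view knows the ViewModel, never the other way around.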


This project takes advantage of many popular libraries, plugins and tools of the Android ecosystem. Most of the libraries are in their stable versions, unless there is a good reason to use a non-stable dependency.


  • Jetpack:
    • Android KTX - provides concise, idiomatic Kotlin for Jetpack and Android platform APIs.
    • AndroidX - major improvement to the original Android Support Library, which is no longer maintained.
    • Data Binding - allows you to bind UI components in your layouts to data sources in your app using a declarative format rather than programmatically.
    • ViewBinding - allows you to more easily write code that interacts with views.
    • Lifecycle - perform actions in response to a change in the lifecycle status of another component, such as activities and fragments.
    • LiveData - lifecycle-aware, meaning it respects the lifecycle of other app components, such as activities, fragments, or services.
    • Navigation - helps you implement navigation, from simple button clicks to more complex patterns, such as app bars and the navigation drawer.
    • ViewModel - designed to store and manage UI-related data in a lifecycle conscious way. The ViewModel class allows data to survive configuration changes such as screen rotations.
  • Coroutines - managing background threads with simplified code and reducing needs for callbacks.
  • Dagger2 - dependency injector, a replacement for all the FactoryFactory classes.
  • Retrofit - type-safe HTTP client.
  • Coil - image loading library for Android backed by Kotlin Coroutines.
  • Kotlinx Serialization - consists of a compiler plugin, that generates visitor code for serializable classes, runtime library with core serialization API and JSON format, and support libraries with ProtoBuf, CBOR and properties formats.
  • Timber - a logger with a small, extensible API which provides utility on top of Android's normal Log class.
  • and more...

Test dependencies

  • Orchestrator - allows you to run each of your app's tests within its own invocation of Instrumentation.
  • Espresso - to write concise, beautiful, and reliable Android UI tests.
  • JUnit - a simple framework to write repeatable tests. It is an instance of the xUnit architecture for unit testing frameworks.
  • JUnit5 - a Gradle plugin that allows for the execution of JUnit 5 tests in Android environments using Android Gradle Plugin 3.5.0 or later.
  • Mockk - provides DSL to mock behavior. Built from zero to fit Kotlin language.
  • AndroidX - the androidx test library provides an extensive framework for testing Android apps.
  • and more...


  • Ktlint - a plugin that creates convenient tasks in your Gradle project to run ktlint checks or auto-format the code.
  • Detekt - a static code analysis tool for the Kotlin programming language.
  • Spotless - a code formatter that can do more than just find formatting errors.
  • Versions - makes it easy to determine which dependencies have updates.
  • JUnit5 - a Gradle plugin that allows for the execution of JUnit5 tests in Android environments using Android Gradle Plugin 3.5.0 or later.
  • and more...

Download Details:

Author: Nuhkoca
Source Code: 
License: Apache-2.0 license

#kotlin #android #architecture #components 

Bongani Ngema


iiCnma: A Playground android App, Showcasing The Latest Technologies


A playground android app, showcasing the latest technologies and architecture patterns using the Movie Database APIs.


iicnma-home-detail.gif iicnma-search.gif iicnma-favorites.gif


  • Kotlin Coroutines, Flow, StateFlow
  • Hilt
  • Paging3
  • Navigation Component
  • LiveData
  • ViewModel
  • Room
  • Retrofit
  • OkHttp3
  • Glide
  • jUnit
  • Mockk
  • Coroutine Test


A custom architecture inspired by Google's MVVM and the Clean architecture.

This architecture allows the app to be offline-first. Data is fetched from the network only when it doesn't exist in the local database, and it is then persisted there. The local database is the single source of truth of the app; after its data changes, it notifies the other layers using coroutine flows.
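That offline-first rule can be sketched in plain Kotlin. The maps below are in-memory stand-ins for Room and the network client, and all names (`MovieRepository`, `getMovie`) are illustrative, not the app's actual classes:

```kotlin
// Offline-first sketch: serve from the local database when possible,
// otherwise fetch from the network, persist, then serve.
// localDb stands in for Room; remoteApi stands in for Retrofit.
class MovieRepository(
    private val localDb: MutableMap<Int, String> = mutableMapOf(),
    private val remoteApi: Map<Int, String> = mapOf(1 to "The Matrix")
) {
    var networkCalls = 0
        private set

    fun getMovie(id: Int): String? {
        localDb[id]?.let { return it }        // local DB is the source of truth
        networkCalls++                        // cache miss: go to the network
        val fetched = remoteApi[id] ?: return null
        localDb[id] = fetched                 // persist for next time
        return fetched
    }
}

fun main() {
    val repo = MovieRepository()
    println(repo.getMovie(1))  // The Matrix (fetched once, then cached)
    println(repo.getMovie(1))  // The Matrix (served from the local DB)
    println(repo.networkCalls) // 1
}
```

In the real app the same flow is asynchronous: Room exposes a `Flow`, and collectors are notified whenever the persisted data changes.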


Clone the repository, get an API key from the Movie Database, and put it in the file as below:


Download Details:

Author: ImnIrdst
Source Code: 
License: MIT license

#kotlin #android #mvvm #clean #architecture 

Bongani Ngema


Alkaa: Open-source App to Manage Your Tasks Quickly and Easily

Alkaa 2.0

Alkaa (begin, start in Finnish) is a to-do application project to study the latest components, architecture and tools for Android development. The project has evolved a lot since the beginning and is available on Google Play! :heart:

The current version of Alkaa was also completely migrated to Jetpack Compose!

📦 Download

Get it on Google Play

📚 Android tech stack

One of the main goals of Alkaa is to use all the latest libraries and tools available.

🧑🏻‍💻 Android development

For more dependencies used in the project, please access the Dependency File.

If you want to check the previous version of Alkaa, please take a look at the last V1 release

🧪 Quality

🏛 Architecture

Alkaa's architecture is strongly based on the Hexagonal Architecture by Alistair Cockburn. The application also relies heavily on modularization for better separation of concerns and encapsulation.

Let's take a look at each major module of the application:

  • app - The Application module. It contains all the initialization logic for the Android environment and starts the Jetpack Navigation Compose Graph.
  • features - The module/folder containing all the features (visual or not) from the application
  • domain - The modules containing the most important part of the application: the business logic. These modules depend only on themselves, and all interaction they do is via dependency inversion.
  • data - The module containing the data (local, remote, light etc) from the app.
  • libraries - The module with useful small libraries for the project, such as design system, navigation, test etc.

This type of architecture protects the most important modules in the app. To achieve this, all dependencies point toward the center, and the modules are organized in such a way that the closer a module is to the center, the more important it is.
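A module layout like this is declared in the Gradle settings file. The sketch below uses the module names listed above; Alkaa's actual module list and nesting may differ:

```kotlin
// settings.gradle.kts (illustrative; real module paths may differ)
include(":app")                      // application shell
include(":features:task")            // a feature module
include(":domain")                   // business logic, center of the graph
include(":data:local")               // data sources
include(":libraries:designsystem")   // shared small libraries
include(":libraries:navigation")
```

Each module's own build script then declares which inner modules it depends on, which is what keeps the dependency arrows pointing toward the center.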

To better represent the idea behind the modules, here is an architecture image representing the flow of dependency:

Alkaa Architecture

Download Details:

Author: igorescodro
Source Code: 
License: Apache-2.0 license

#kotlin #android #clean #architecture 

Hunter Krajcik


Download Microsoft Azure Architecture Icons

Azure architecture icons help us build custom architecture diagrams for our designs and solutions for customers.

Follow the steps below to download the official Microsoft Azure Architecture icons.

Step 1

Click on the link Azure icons – Azure Architecture Center | Microsoft Learn.

Step 2

On the Azure architecture icons page, select the "I agree to the above terms" check box and click Download SVG icons.

How To Download Microsoft Azure Architecture Icons

Step 3

All the Azure icons will be downloaded as a zip file to the Downloads folder. Unzip the folder and have a look at all the icons used in Microsoft Azure products in the Icons folder.


Note: Check the article for icon updates and download the latest icons whenever required.


Hope you have successfully downloaded Microsoft Azure Architecture icons.

Like and share your valuable feedback on this blog.

Original article source at:

#azure #architecture #icons 

Rupert Beatty


ApplicationCoordinator: Coordinators Essential Tutorial


A lot of developers need to change the navigation flow frequently, because it depends on business tasks, and they spend a huge amount of time re-writing code. In this approach, I demonstrate our implementation of Coordinators: the creation of a protocol-oriented, testable architecture written in pure Swift without downcasts, which also avoids violating the S.O.L.I.D. principles.

The example provides a very basic structure with 6 controllers and 5 coordinators, with mock data and logic.

I used a protocol for coordinators in this example:

protocol Coordinator: class {
    func start()
    func start(with option: DeepLinkOption?)
}

All flow controllers have protocols (we need to configure blocks and handle callbacks in coordinators):

protocol ItemsListView: BaseView {
    var authNeed: (() -> ())? { get set }
    var onItemSelect: ((ItemList) -> ())? { get set }
    var onCreateButtonTap: (() -> ())? { get set }
}

In this example, I use factories for creating coordinators and controllers (we can mock them in tests).

protocol CoordinatorFactory {
    func makeItemCoordinator(navController: UINavigationController?) -> Coordinator
    func makeItemCoordinator() -> Coordinator
    func makeItemCreationCoordinatorBox(navController: UINavigationController?) ->
        (configurator: Coordinator & ItemCreateCoordinatorOutput,
        toPresent: Presentable?)
}

The base coordinator stores dependencies of child coordinators:

class BaseCoordinator: Coordinator {
    var childCoordinators: [Coordinator] = []

    func start() { }
    func start(with option: DeepLinkOption?) { }

    // add only unique object
    func addDependency(_ coordinator: Coordinator) {
        for element in childCoordinators {
            if element === coordinator { return }
        }
        childCoordinators.append(coordinator)
    }

    func removeDependency(_ coordinator: Coordinator?) {
        guard
            childCoordinators.isEmpty == false,
            let coordinator = coordinator
            else { return }

        for (index, element) in childCoordinators.enumerated() {
            if element === coordinator {
                childCoordinators.remove(at: index)
                break
            }
        }
    }
}

The AppDelegate stores a lazy reference to the Application Coordinator:

var rootController: UINavigationController {
    return self.window!.rootViewController as! UINavigationController
}

private lazy var applicationCoordinator: Coordinator = self.makeCoordinator()

func application(_ application: UIApplication,
                 didFinishLaunchingWithOptions launchOptions: [UIApplicationLaunchOptionsKey: Any]?) -> Bool {
    let notification = launchOptions?[.remoteNotification] as? [String: AnyObject]
    let deepLink = DeepLinkOption.build(with: notification)
    applicationCoordinator.start(with: deepLink)
    return true
}

private func makeCoordinator() -> Coordinator {
    return ApplicationCoordinator(
        router: RouterImp(rootController: self.rootController),
        coordinatorFactory: CoordinatorFactoryImp()
    )
}
Based on the post about Application Coordinators and Application Controller pattern description

Coordinators Essential tutorial. Part I

Coordinators Essential tutorial. Part II

Download Details:

Author: AndreyPanov
Source Code: 
License: MIT license

#swift #ios #architecture 

Hunter Krajcik


A dependency management library inspired by SwiftUI's "environment."


A dependency management library inspired by SwiftUI's “environment.”

Learn More

This library was motivated and designed over the course of many episodes on Point-Free, a video series exploring functional programming and the Swift language, hosted by Brandon Williams and Stephen Celis.

video poster image 


Dependencies are the types and functions in your application that need to interact with outside systems that you do not control. Classic examples of this are API clients that make network requests to servers, but seemingly innocuous things such as UUID and Date initializers, file access, user defaults, and even clocks and timers can all be thought of as dependencies.

You can get really far in application development without ever thinking about dependency management (or, as some like to call it, "dependency injection"), but eventually uncontrolled dependencies can cause many problems in your code base and development cycle:

Uncontrolled dependencies make it difficult to write fast, deterministic tests because you are susceptible to the vagaries of the outside world, such as file systems, network connectivity, internet speed, server uptime, and more.

Many dependencies do not work well in SwiftUI previews, such as location managers and speech recognizers, and some do not work even in simulators, such as motion managers, and more. This prevents you from being able to easily iterate on the design of features if you make use of those frameworks.

Dependencies that interact with 3rd party, non-Apple libraries (such as Firebase, web socket libraries, network libraries, etc.) tend to be heavyweight and take a long time to compile. This can slow down your development cycle.

For these reasons, and a lot more, it is highly encouraged for you to take control of your dependencies rather than letting them control you.

But, controlling a dependency is only the beginning. Once you have controlled your dependencies, you are faced with a whole set of new problems:

How can you propagate dependencies throughout your entire application in a way that is more ergonomic than explicitly passing them around everywhere, but safer than having a global dependency?

How can you override dependencies for just one portion of your application? This can be handy for overriding dependencies for tests and SwiftUI previews, as well as specific user flows such as onboarding experiences.

How can you be sure you overrode all dependencies a feature uses in tests? It would be incorrect for a test to mock out some dependencies but leave others as interacting with the outside world.

This library addresses all of the points above, and much, much more. Explore all of the tools this library comes with by checking out the documentation, and reading these articles:

Getting started

Quick start: Learn the basics of getting started with the library before diving deep into all of its features.

What are dependencies?: Learn what dependencies are, how they complicate your code, and why you want to control them.


Using dependencies: Learn how to use the dependencies that are registered with the library.

Registering dependencies: Learn how to register your own dependencies with the library so that they immediately become available from any part of your code base.

Live, preview, and test dependencies: Learn how to provide different implementations of your dependencies for use in the live application, as well as in Xcode previews, and even in tests.


Designing dependencies: Learn techniques on designing your dependencies so that they are most flexible for injecting into features and overriding for tests.

Overriding dependencies: Learn how dependencies can be changed at runtime so that certain parts of your application can use different dependencies.

Dependency lifetimes: Learn about the lifetimes of dependencies, how to prolong the lifetime of a dependency, and how dependencies are inherited.

Single entry point systems: Learn about "single entry point" systems, and why they are best suited for this dependencies library, although it is possible to use the library with non-single entry point systems.


  • Concurrency support: Learn about the concurrency tools that come with the library that make writing tests and implementing dependencies easy.
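As a small taste of the API, here is a sketch based on the library's documented `@Dependency` property wrapper and `withDependencies` function; `ItemFactory` and `makeID` are hypothetical names invented for this example:

```swift
import Dependencies
import Foundation

// A hypothetical feature type that pulls its UUID generator from the
// dependency system instead of calling UUID() directly.
struct ItemFactory {
  @Dependency(\.uuid) var uuid

  func makeID() -> UUID {
    uuid()
  }
}

// In tests or previews, the dependency can be overridden for a scope,
// making the factory's output deterministic instead of random.
let factory = withDependencies {
  $0.uuid = .incrementing
} operation: {
  ItemFactory()
}
```

Because the override is scoped rather than global, other parts of the application continue to use the live UUID generator.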


We rebuilt Apple's Scrumdinger demo application using modern best practices for SwiftUI development, including using this library to control dependencies on file system access, timers and speech recognition APIs. That demo can be found in our SwiftUINavigation library.


The latest documentation for the Dependencies APIs is available here.


You can add Dependencies to an Xcode project by adding it to your project as a package.

If you want to use Dependencies in a SwiftPM project, it's as simple as adding it to your Package.swift:

dependencies: [
  .package(url: "", from: "0.1.0")
]

And then adding the product to any target that needs access to the library:

.product(name: "Dependencies", package: "swift-dependencies"),


This library controls a number of dependencies out of the box, but is also open to extension. The following projects all build on top of Dependencies:


There are many other dependency injection libraries in the Swift community. Each has its own set of priorities and trade-offs that differ from Dependencies. Here are a few well-known examples:

Download Details:

Author: Pointfreeco
Source Code: 
License: MIT license

#swift #dependencies #architecture 

Rupert Beatty


The Universal System Operator and Architecture for RxSwift


The simplest architecture for RxSwift

    typealias Feedback<State, Event> = (Observable<State>) -> Observable<Event>

    public static func system<State, Event>(
        initialState: State,
        reduce: @escaping (State, Event) -> State,
        feedback: Feedback<State, Event>...
    ) -> Observable<State>



  • If it did happen -> Event
  • If it should happen -> Request
  • To fulfill Request -> Feedback loop


  • System behavior is first declaratively specified and effects begin after subscribe is called => Compile time proof there are no "unhandled states"

Debugging is easier

  • A lot of the logic is just normal pure functions that can be debugged using the Xcode debugger, or just by printing the commands.

Can be applied on any level

  • Entire system
  • application (state is stored inside a database, CoreData, Firebase, Realm)
  • view controller (state is stored inside system operator)
  • inside feedback loop (another system operator inside feedback loop)

Works awesome with dependency injection


  • Reducer is a pure function, just call it and assert results
  • In case effects are being tested -> TestScheduler

Can model circular dependencies

Completely separates business logic from effects (Rx).

  • Business logic can be transpiled between platforms (ShiftJS, C++, J2ObjC)


Simple UI Feedback loop

Complete example

    Observable.system(
        initialState: 0,
        reduce: { (state, event) -> State in
            switch event {
            case .increment:
                return state + 1
            case .decrement:
                return state - 1
            }
        },
        scheduler: MainScheduler.instance,
        feedback:
            // UI is user feedback
            bind(self) { me, state -> Bindings<Event> in
                let subscriptions = [
                    state.map(String.init).bind(to: me.label.rx.text)
                ]

                let events = [
                    me.plus.rx.tap.map { Event.increment },
                    me.minus.rx.tap.map { Event.decrement }
                ]

                return Bindings(
                    subscriptions: subscriptions,
                    events: events
                )
            }
    )

Play Catch

Simple automatic feedback loop.

Complete example

    Observable.system(
        initialState: State.humanHasIt,
        reduce: { (state: State, event: Event) -> State in
            switch event {
            case .throwToMachine:
                return .machineHasIt
            case .throwToHuman:
                return .humanHasIt
            }
        },
        scheduler: MainScheduler.instance,
        feedback:
            // UI is human feedback
            bindUI,
            // NoUI, machine feedback
            react(request: { $0.machinePitching }, effects: { (_) -> Observable<Event> in
                return Observable<Int>
                    .timer(1.0, scheduler: MainScheduler.instance)
                    .map { _ in Event.throwToHuman }
            })
    )


Complete example

    Driver.system(
        initialState: State.empty,
        reduce: State.reduce,
        feedback:
            // UI, user feedback
            bindUI,
            // NoUI, automatic feedback
            react(request: { $0.loadNextPage }, effects: { resource in
                return URLSession.shared.loadRepositories(resource: resource)
                    .asSignal(onErrorJustReturn: .failure(.offline))
            })
    )

Run RxFeedback.xcodeproj > Example to find out more.



CocoaPods is a dependency manager for Cocoa projects. You can install it with the following command:

$ gem install cocoapods

To integrate RxFeedback into your Xcode project using CocoaPods, specify it in your Podfile:

pod 'RxFeedback', '~> 3.0'

Then, run the following command:

$ pod install


Carthage is a decentralized dependency manager that builds your dependencies and provides you with binary frameworks.

You can install Carthage with Homebrew using the following command:

$ brew update
$ brew install carthage

To integrate RxFeedback into your Xcode project using Carthage, specify it in your Cartfile:

github "NoTests/RxFeedback" ~> 3.0

Run carthage update to build the framework and drag the built RxFeedback.framework into your Xcode project. As RxFeedback depends on RxSwift and RxCocoa you need to drag the RxSwift.framework and RxCocoa.framework into your Xcode project as well.

Swift Package Manager

The Swift Package Manager is a tool for automating the distribution of Swift code and is integrated into the swift compiler.

Once you have your Swift package set up, adding RxFeedback as a dependency is as easy as adding it to the dependencies value of your Package.swift.

dependencies: [
    .package(url: "", majorVersion: 1)
]

Difference from other architectures

  • Elm - pretty close, feedback loops for effects instead of Cmd, which effects to perform are encoded into state and queried by feedback loops
  • Redux - kind of like this, but feedback loops instead of middleware
  • Redux-Observable - observables observe state vs. being inside middleware between view and state
  • Cycle.js - no simple explanation :), ask @andrestaltz
  • MVVM - separates state from effects and doesn't require a view

Download Details:

Author: NoTests
Source Code: 
License: MIT license

#swift #architecture #feedback #loop


Learn MarkLogic Server Architecture


Data is the new oil, and hence managing data is of utmost importance for any enterprise. With the huge amount of data now generated, and the need to provide superior performance over it, NoSQL databases are now ruling the tech industry. Among the numerous NoSQL databases on the market, this emerging one is catching the attention of numerous techies and businesses. MarkLogic will definitely have a very prosperous future.

What is MarkLogic Server?

In a single sentence, MarkLogic is an enterprise NoSQL multi-model database management system. Let's now break down that sentence to get a clearer picture.

  • Enterprise – MarkLogic provides enterprise features like security, ACID transactions, and real-time full-text search.
  • NoSQL – MarkLogic is obviously a NoSQL database at its core, hence we can expect the flexibility and scalability of a NoSQL DB.
  • Multi-model – We can save all data, no matter what shape or form it is in.
  • Database – MarkLogic stores the data.
  • Management System – MarkLogic doesn't just dump the data; it helps to govern it.

Marklogic Server Architecture

MarkLogic is basically a clustered database that has multiple nodes running. The following is the layered structure inside one node.

Let’s understand the different layers in detail in the bottom-up approach.

Data Layer

At the bottom, we have the data layer, and at the bottom of that is the storage system for storing the data. It is multi-model, so there are different kinds of storage. It can store compressed text like JSON and XML, and the structure of those documents is understood at this level. There is binary storage for images and videos, and semantic storage for semantic triples and semantic relationships.

Next, we have an extensive set of indexes, led by the main full-text index. There are also other specialized indexes like geospatial, scalar, semantic, relational, etc., as well as a security index at this level. All data access in MarkLogic is mediated through the security index, as MarkLogic provides security at the most fundamental level of data access.

Caches – Provide efficient access to data storage and data on disk.
Journal – The data in MarkLogic, i.e. the compressed data and the indexes, is written in batches. First a journal entry is made, then the data is committed to disk. So in case of a disaster before a batch is committed, we still have the committed journal record: we can start up, get back to a known good state, and maintain consistency.

Transaction Controller – Handles all of the above, mediating transactions across the cluster. It follows the ACID properties, so even very complicated transactions make it to all the nodes in the cluster together, or not at all.

Query Layer

Broadcaster – At the base of the query layer is the broadcaster, which federates queries across the cluster and to multiple threads within this node in the cluster.
Aggregator – Consolidates those partial results into a complete result set.

Caches – Used to cache queries that are executed frequently.

Evaluator – There are multiple evaluators in MarkLogic, the two main ones being JavaScript (for JSON) and XQuery (for XML), as well as other specialized evaluators for more specific data formats, like SQL for relational data. Supporting all these evaluators is an extensive library of functions that helps make them even more capable.


The interface to these is HTTP REST endpoints. There is an extensive collection of endpoints to facilitate document search, CRUD operations, administration, and more. New endpoints can also be defined as business requirements dictate for the required data services.


If we are working in Java or Node.js, there are client APIs that provide access to the same set of services. We can also take our own endpoint specifications and compile them so that developers can access them in an idiomatic way. From any other language, such as Python or a shell script, we can simply call the REST endpoints over HTTP in the normal way.
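For instance, MarkLogic exposes search through its /v1/search REST endpoint. A small Python sketch that builds such a request URL follows; the host, port, and query string are illustrative assumptions:

```python
from urllib.parse import urlencode

# MarkLogic exposes search through the /v1/search REST endpoint.
# Host, port, and query values below are illustrative assumptions.
def build_search_url(host, port, query, fmt="json"):
    params = urlencode({"q": query, "format": fmt})
    return f"http://{host}:{port}/v1/search?{params}"

url = build_search_url("localhost", 8000, "architecture")
# The request itself would typically use digest authentication, e.g. via
# urllib.request's HTTPDigestAuthHandler or a third-party HTTP client.
print(url)  # http://localhost:8000/v1/search?q=architecture&format=json
```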


How does it fit into the bigger picture? MarkLogic is a distributed database with multiple nodes in a cluster; the description above covers just one node. Each node can act as just the data layer, just the query layer and interface, or a combination. This can be deployed on-premises or in the cloud, where we can use one of the managed services. For example, with the query service we get an elastic pool of nodes focused on the query layer that scales with our workload; with the Data Hub service we get a full-stack application dedicated to helping us integrate data. That's the MarkLogic server and how it fits into our world.

Original article source at:

#server #architecture 

Learn Marklogic Server Architecture
Desmond  Gerber

Desmond Gerber


Learn Data Catalog Architecture for Enterprise Data Assets

Introduction to Data Catalog Architecture

"By 2019, data and analytics organizations that provide agile, curated internal and external datasets for a range of content authors will realize twice the business benefits of those that do not," according to one industry study. Yet organizations continue to underestimate the importance of metadata management and cataloging. Given that data unification and collaboration are becoming increasingly important success factors for businesses, it's worth revisiting the data catalog and its advantages for the entire enterprise, as it will soon become a pillar of any data-driven strategy. A data catalog is a comprehensive inventory of all data assets in an organization, intended to help data professionals rapidly locate the most suitable data for any analytical or business purpose.

What is Data Catalog?

A data catalog is an inventory of all the data an organization has: a library where data is indexed, organized, and stored. Most data catalogs capture data sources, data usage information, and data lineage, explaining where the data came from and how it evolved into its current state. Organizations can use a data catalog to centralize information, classify what they have, and organize data based on its content and source. A data catalog's goal is to help you understand your data and learn what you didn't know before.
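As a toy illustration of the idea, a catalog can be modeled as an index of entries describing each dataset's source, tags, and lineage. All dataset names and fields below are hypothetical:

```python
# A toy in-memory data catalog: each entry records source, tags, and lineage.
catalog = {
    "sales_2023": {
        "source": "warehouse.sales",
        "tags": ["sales", "finance"],
        "lineage": ["raw_orders"],   # upstream datasets it was derived from
    },
    "raw_orders": {
        "source": "crm.orders",
        "tags": ["sales", "raw"],
        "lineage": [],
    },
}

def search(tag):
    """Return the names of all datasets carrying the given tag."""
    return sorted(name for name, entry in catalog.items() if tag in entry["tags"])

print(search("sales"))  # ['raw_orders', 'sales_2023']
```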

Some Important points

As a result, ensure you don't leave any data out of the catalog. Your Big Data activities can also include a data cataloging service. 

Make it a part of your daily routine rather than a separate task. Align the data plan with the catalog.

Set accessibility rules to avoid unauthorized data access.

Why is Data Catalog Important?

Listed below are the reasons Why Data Catalog is important:

Dataset Searching

A data catalog offers robust search capabilities: scanning by facets, keywords, and business terms. Non-technical users appreciate the ability to search using natural language, and the ability to rank search results by relevance and frequency of use is a further advantage.

Dataset Evaluation

The ability to assess a dataset's suitability for an analysis use case without having to download or procure the data first is critical. Important evaluation features include previewing a dataset, seeing all related metadata, viewing user ratings, reading user reviews and curator annotations, and viewing data quality information.

Data Access

The journey from search to evaluation to data access should be a smooth one, with the catalog understanding access protocols and either granting access directly or collaborating with access technologies. Data access functions include safeguards for confidentiality, privacy, and compliance around sensitive data.

How does Data Catalog work?

Today's data production must scale to accommodate massive data volumes and high-performance computing. To adapt to data, technology, and consumer needs, it must be versatile and resilient. It must ensure that essential data information is readily available for customers to access and comprehend. It must be able to handle all data speeds, from streaming to batch ETL (Extract, Transform, and Load). It should be able to handle all forms of data, from relational to unstructured and semi-structured. It must allow all data users access to data while still protecting confidential data, and none of this is possible without metadata.

Source Data

Connecting to the necessary data sources. Sources include data from within the company and from outside. Relationally structured, semi-structured, multi-structured, and unstructured data are all included.

Ingest Data

Bringing data into the analytics process. Batch and real-time ingestion methods are available, ranging from batch ETL to data stream processing. Scalability and elasticity are critical for adapting to changes in data volume and speed.

Refine Data

Data lakes, data warehouses, and master data/reference data hubs are all examples of shareable data stores. The data refinery is in charge of data cleansing, integration, aggregation, and other forms of data transformation.

Access data

Access to data is provided in various ways, including query, data virtualization, APIs, and data services, for both people and the applications and algorithms that use it.

Analyze Data

Turning data into information and insights includes basic reporting to data science, artificial intelligence, and machine learning.

Consume Data

Data consumption is the point at which data and people become inextricably linked. Data consumption aims to get from data and observations to decisions, behavior, and effects.

Key Ingredients for a Successful Data Catalog

Not all data catalogs are created equal. It's critical to filter players based on key capabilities when selecting a data catalog. As a result, many data catalogs, including Talend Data Catalog, rely on critical components that ensure your data strategy's effectiveness. Let's take a look at some of the essential features:

Connectors and easy-to-use curation tools to build your single place of trust

Having many connectors enhances the data catalog's ability to map the physical datasets in your environment, regardless of their origin or source. Powerful capabilities let you extract metadata from business intelligence software, data integration tools, SQL queries, enterprise apps like Salesforce or SAP, or data modeling tools, allowing you to onboard people to verify and certify your datasets for broader use.

Automation to gain speed and agility

Data stewards won't have to waste time manually linking data sources thanks to improved automation. They'll then concentrate on what matters most: fixing data quality problems and curating them for the whole company's good. Of course, you'll need the support of stewards to complement automation – to enrich and curate datasets over time.

Powerful search to quickly explore large datasets.

As the primary component of a catalog, the search should be multifaceted, allowing you to combine various criteria in an advanced search. Search parameters include name, size, time, owner, and format.

Lineage to conduct root-cause analysis

Lineage allows you to link a dashboard to the data it displays. Understanding the relationships between various forms and data sources relies heavily on lineage and relationship exploration. So, if your dashboard shows erroneous data, a steward can use the lineage to determine where the issue originates.
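The root-cause workflow described above amounts to walking the lineage graph upstream. A minimal Python sketch, with invented dataset names:

```python
# Toy lineage graph: each dataset maps to its direct upstream sources.
# Walking the graph upstream supports root-cause analysis: if the dashboard
# shows bad numbers, the steward inspects every dataset that feeds it.
lineage = {
    "dashboard_kpis": ["sales_agg"],
    "sales_agg": ["raw_orders", "raw_refunds"],
    "raw_orders": [],
    "raw_refunds": [],
}

def upstream(dataset):
    """Return every dataset that feeds (directly or indirectly) into `dataset`."""
    seen = set()
    stack = list(lineage.get(dataset, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(lineage.get(node, []))
    return sorted(seen)

print(upstream("dashboard_kpis"))  # ['raw_orders', 'raw_refunds', 'sales_agg']
```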

Glossary to add business context to your data

The ability to federate people around the data is essential for governance. To do so, they must share a common understanding of terms, definitions, and how those relate to the data; that is where the glossary helps. If you search for PII in a data catalog, for example, you can find every data source containing it. This is especially useful in the context of GDPR (General Data Protection Regulation), where you need to take stock of all the data you hold.

Profiling to avoid polluting your data lake

When linking multiple data sources, data profiling is essential for determining your data quality in terms of completeness, accuracy, timeliness, and consistency. It saves time and enables you to spot inaccuracies quickly, allowing you to warn stewards before the data lake is polluted.
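As a sketch, the completeness dimension mentioned above can be measured as the share of non-missing values per column. The rows and column names below are invented:

```python
# Toy data profiling: measure completeness (share of non-missing values)
# per column before the data enters the lake.
rows = [
    {"id": 1, "email": "a@example.com", "country": "DE"},
    {"id": 2, "email": None,            "country": "FR"},
    {"id": 3, "email": "c@example.com", "country": None},
]

def completeness(rows, column):
    """Fraction of rows where `column` has a non-missing value."""
    filled = sum(1 for r in rows if r.get(column) is not None)
    return filled / len(rows)

print(round(completeness(rows, "email"), 2))  # 0.67
```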

Benefits of Data Catalog

The whole company gains when data professionals can help themselves to the data they need without IT intervention, without relying on experts or colleagues for guidance, without being limited to just the assets they are familiar with, and without having to worry about governance enforcement.

Improved context for data

Analysts can find comprehensive explanations of data, including input from other data citizens, and understand how data is important to the company.

Increased operational efficiency

A data catalog establishes an efficient division of labor between users and IT—data people can access and interpret data more quickly. At the same time, IT workers can concentrate on higher-priority tasks.

Reduced risk

Analysts may be more confident that they're dealing with data they've been granted permission to use for a specific reason and that they're following business and data privacy regulations. They can also quickly scan annotations and metadata for null fields or incorrect values that might skew the results.

Greater success with data management initiatives

The more difficult it is for data analysts to identify, access, understand, and trust data, the less likely BI and big data projects are to succeed.

Better and faster data analysis

Data professionals can respond rapidly to problems, opportunities, and challenges with analysis and answers based on all of the company's most appropriate, contextual data.

A data catalog will also assist the company in achieving particular technological and business goals. A data catalog can help discover new opportunities for cross-selling, up-selling, targeted promotions, and more by supplying analysts with a holistic view of their customers.

Role of the Data Catalog

Metadata is a thread that connects all other building materials, including ways for ingestion to be aware of sources, refinement to be connected to ingestion, and so on. Every component of the architecture contributes to the development and use of metadata.

Data Acquisition

  • Sourcing and ingestion are the points at which the data catalog is continuously updated with metadata records for all data within the analytics ecosystem.
  • The intelligent data catalog includes AI / ML capabilities for retrieving and extracting metadata, reducing the manual effort required to capture metadata, and improving the level of metadata completeness.

Data Modification

Collects information on data flow across data pipelines and all data flow changes. This involves all data pipelines that send data to data lakes, warehouses, and processing pipelines.

This metadata, which is derived from data perception, offers lineage information critical for accurate data and a helpful tool for tracking and troubleshooting issues.

Data Availability and Access

  • Analysts rely heavily on the data catalog to collect the data they need, interpret and evaluate it, and know how to navigate it. Metadata also connects data access and governance, ensuring access restrictions are implemented.
  • Data valuation processes benefit from collecting metadata regarding access rates, and learning who accesses data the most frequently aids data professionals.

Consuming Data

This allows for collecting metadata on who uses what data, what kinds of use cases are used, and what effect the data has on the enterprise. Data processing and data-driven cultures are built on a deep understanding of data users and their data dependencies.

Everyone dealing with data should know the amount of knowledge available on data policy, preparation, and management.

Managing Data Governance, Administration, and Infrastructure Management

It is founded on an understanding of the data, its processing systems, its uses, and its consumers.

Data collection systems are integrated and data processing workflows are supported, as this information is managed as metadata in the data catalog.


Data-driven organizations are a goal for many businesses. They want more accurate, quicker analytics without losing security. That is why data processing is becoming increasingly necessary and challenging. A data catalog makes stored data easy to manage and meets various demands. It's challenging to manage data in the era of big data, data lakes, and self-service; a data catalog assists in meeting those challenges. Active data curation is a vital digital data processing method and a key component of data catalog performance.


Original article source at:

#data #asset #architecture 

Learn Data Catalog Architecture for Enterprise Data Assets
Gordon  Matlala



Data Mesh Architecture and its Benefits

Introduction to Data Mesh

Data mesh builds a layer of connectivity that takes away the complexities of connecting, managing, and supporting data access. It is a way to stitch together data held across multiple data silos, combining data distributed across different locations and organizations. It provides data that is highly available, easily discoverable, and secure. It is beneficial in an organization where teams generate data for many data-driven use cases and access patterns.

We can use it, for example, when we need to connect cloud applications to sensitive data that lives in a customer's cloud environment, or when we need to create virtual data catalogs drawn from various data sources that can't be centralized. It is also used when we create virtual data warehouses or data lakes for analytics and ML training without consolidating data into a single repository.

What is Anthos Service Mesh?

It is a fully managed service mesh for complex microservices architectures: a suite of tools that monitors and manages a reliable service mesh on-premises or on Google Cloud. It is powered by Istio, a highly configurable and powerful open-source service mesh platform with tools and features that enable industry best practices. It defines and manages configuration centrally, at a higher level, and is deployed as a uniform layer across the full infrastructure. Service developers and operators can use a rich feature set without making a single change to the application code.

Anthos Service Mesh relies on Google Kubernetes Engine (GKE) and GKE On-Prem observability features. Microservices architectures provide many benefits, but they also bring challenges such as added complexity and fragmentation across different workloads. Anthos Service Mesh unburdens operations and development teams by simplifying service delivery across the board, from traffic management and mesh telemetry to securing communications between services.

What are the features of Anthos Service Mesh?

Here are some of the features of Anthos Service Mesh

Deep visibility built-in [beta]

Anthos Service Mesh is integrated with Cloud Logging, Cloud Monitoring, and Cloud Trace, which provides many benefits, such as monitoring SLOs at a per-service level and setting targets for latency and availability.

Easy authentication, encryption

Anthos Service Mesh makes authentication and encryption easy. Transport authentication through mTLS (mutual Transport Layer Security) has never been more effortless: it secures service-to-service as well as end-user-to-service communications with a one-click mTLS installation or an incremental rollout.

Flexible authorization

It provides flexible authorization: we only need to specify the permissions and then grant access to them at the level we choose, from namespace down to individual users.

Fine-grained traffic controls

Anthos Service Mesh opens up many traffic management features, as it decouples traffic flow from infrastructure scaling and includes dynamic request routing for A/B testing, canary deployments, and gradual rollouts, all outside of your application code.

Failure recovery out of the box

It provides many critical failure-recovery features out of the box that can be configured dynamically at runtime.

What is Azure Service Fabric Mesh?

Azure Service Fabric Mesh lets developers deploy microservices applications without managing virtual machines, storage, or networking. Applications hosted on Service Fabric Mesh run and scale without you worrying about the infrastructure powering them. Service Fabric Mesh consists of clusters of many machines, and all of these cluster operations are hidden from the developer.

You only need to upload the code and specify the resources you need, your availability requirements, and resource limits. It automatically allocates the infrastructure and handles infrastructure failures, ensuring your applications are highly available. You take care of the health and responsiveness of the application, not the infrastructure. Azure Service Fabric has three public offerings: the Service Fabric Azure Cluster service, Service Fabric Standalone, and the Azure Service Fabric Mesh service.

What is AWS App Mesh?

AWS App Mesh helps you run services by providing consistent visibility and network traffic controls for services built across multiple types of compute infrastructure. App Mesh removes the need to update application code in order to change how monitoring data is collected or how traffic is routed between services. It configures each service to export monitoring data and implements consistent communications control logic across your application. When a failure occurs or code changes must be deployed, this makes it easy to pinpoint the precise location of errors quickly and automatically reroute network traffic.

What are the advantages of AWS App Mesh?

Following are the advantages of AWS App Mesh.

End-to-end Visibility

App Mesh captures metrics, logs, and traces from all of your applications. We can combine and export this data to Amazon CloudWatch, AWS X-Ray, and community tools for monitoring, helping to quickly identify and isolate issues with any service and optimize your entire application.

Ensure High Availability

App Mesh gives you controls to configure how traffic flows between your services. Easily implement custom traffic routing rules to ensure that every service is highly available during deployments, after failures, and as your application scales.

Streamline Operations

App Mesh configures and deploys a proxy that manages all communications traffic to and from your services. This removes the requirement to configure communication protocols for every service, write custom code, or implement libraries to control the application.

Enhance Any Application

Users can use App Mesh with services running on any compute service, such as AWS Fargate, Amazon EKS, Amazon ECS, and Amazon EC2. App Mesh can also monitor and control communications for monoliths running on EC2, and for teams running containerized applications, orchestration systems, or services across VPCs, all with no code changes.

Hybrid Deployments

To configure a service mesh for applications deployed on-premises, we can use AWS App Mesh on AWS Outposts. AWS Outposts is a fully managed service that extends AWS infrastructure, AWS services, APIs, and tools to virtually any connected site. With AWS App Mesh on Outposts, you can provide consistent communication control logic for services across AWS Outposts and the AWS cloud to simplify hybrid application networking.

Data Mesh vs Data Lake

Given below are the differences between Data Mesh and Data Lake.

  • A data lake is a storage repository that holds a vast amount of raw data in its native format. Where a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture to store data.
  • The advantage of the data lake is that it is a centralized, singular, schema-less data store holding raw (as-is) data as well as massaged data.
  • It offers a mechanism for fast ingestion of data with appropriate latency.
  • It helps map data across various sources and gives users visibility and security.
  • It provides a catalog to find and retrieve data.
  • It has the costing model of a centralized service.
  • It offers the ability to manage security, permissions, and data masking.
  • The main difference between a data mesh and a data lake is ownership: a data lake centralizes ownership of the raw data, so domain teams often treat their data as a byproduct they don't own, whereas a data mesh decentralizes ownership to the domain teams.

How is Data Mesh different from Data Fabric?

  • Data Fabric integrates data management across cloud and on-premises environments to accelerate digital transformation. It helps deliver consistent, integrated hybrid cloud data services for data visibility and insights, data access and control, and data protection and security.
  • The difference is that a data fabric allows clear access and sharing of data across distributed computing systems by means of a single, secured, and governed data management framework.
  • A data mesh, by contrast, follows a metadata-driven approach and is a distributed data architecture supported by machine learning capabilities. It is a tailor-made distributed ecosystem with reusable data services, a centralized governance policy, and dynamic data pipelines.

What are the benefits of Data Mesh?

  1. It provides agility: each node works independently, is containerized, and can be deployed as soon as any changes are ready.
  2. It provides scalability: new nodes can be constructed and deployed to the mesh whenever new data arises, and many portals and teams can access the same node, allowing the organization to scale the data mesh.
  3. It can be used under various circumstances, like connecting cloud applications to sensitive data that lives in a customer's on-premises or cloud environment, creating virtual data catalogs from various data sources, or creating virtual data warehouses or data lakes for analytics and machine learning training without consolidating data into a single repository.


A data mesh allows the organization to escape the analytical and consumptive confines of monolithic data architectures and connects siloed data to enable machine learning and automated analytics at scale. It allows the company to be data-driven, give up data lakes and data warehouses, and replace them with the power of data access, control, and connectivity.

Original article source at:

#data #architecture 

Data Mesh Architecture and its Benefits
Sheldon  Grant



Apache Pulsar Architecture and Benefits

Introduction to Apache Pulsar

Apache Pulsar is a multi-tenant, high-performance server-to-server messaging system originally developed by Yahoo. It was open-sourced in late 2016 and is now incubating under the Apache Software Foundation (ASF). Pulsar works on the pub-sub pattern, with producers and consumers (also called subscribers). The topic is the core of the pub-sub model: producers publish their messages to a given Pulsar topic, and consumers subscribe to a topic to receive messages from it and send an acknowledgement.

Once a message is published to a subscription, Pulsar retains it until a consumer acknowledges it; only after a consumer's acknowledgement has been processed does the message get deleted.

Apache Pulsar Topics

Topics are well-defined named channels for transmitting messages from producers to consumers. Topic names are well-defined URLs.

Namespaces: a namespace is a logical nomenclature within a tenant. A tenant can create multiple namespaces via the admin API. A namespace allows an application to create and manage a hierarchy of topics, and any number of topics can be created under a namespace.

Apache Pulsar Subscription Modes

A subscription is a named configuration rule that determines how messages are delivered to consumers. There are three subscription modes in Apache Pulsar.


Apache Pulsar Subscription Mode Exclusive

In exclusive mode, only a single consumer is allowed to attach to the subscription. If more than one consumer attempts to subscribe to a topic using the same subscription, the later consumer receives an error. Exclusive is the default subscription mode.


Apache Pulsar Subscription Failover

In failover mode, multiple consumers can attach to the same subscription. The consumers are sorted lexically by name, and the first consumer is the master consumer, which receives all the messages. When the master consumer disconnects, the next consumer in line receives the messages.


Apache Pulsar Subscription Mode Shared

In shared, or round-robin, mode, multiple consumers can attach to the same subscription, and each message is delivered to only one of them in a round-robin manner. When a consumer disconnects, all the messages that were sent to it but not acknowledged are re-scheduled for delivery to the remaining consumers. Limitations of shared mode:

  • Message ordering is not guaranteed.
  • You can’t use cumulative acknowledgement with shared mode.
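The round-robin behavior of shared mode can be illustrated with a small Python model. This sketches the dispatch pattern only; it is not the Pulsar client API:

```python
from itertools import cycle

# Toy illustration of shared-mode dispatch: messages are spread across the
# attached consumers in round-robin fashion, so no single consumer sees
# every message (which is why ordering is not guaranteed).
consumers = ["consumer-1", "consumer-2", "consumer-3"]
assignment = {}

for message, consumer in zip(range(6), cycle(consumers)):
    assignment.setdefault(consumer, []).append(message)

print(assignment)
# {'consumer-1': [0, 3], 'consumer-2': [1, 4], 'consumer-3': [2, 5]}
```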


Routing Modes

The routing mode determines which partition of a topic a message will be published to. There are three types of routing modes. Routing is relevant when publishing to partitioned topics.

Round Robin Partition 

If no key is provided, the producer publishes messages across all available partitions in a round-robin fashion to achieve maximum throughput. Round-robin routing is not done per individual message but per batch (bounded by the batching delay), which ensures effective batching. If a key is specified on a message, the partitioned producer hashes the key and assigns the message to the corresponding partition. This is the default mode.

Single Partition

If no key is provided, the producer randomly picks a single partition and publishes all messages to that partition. If a key is specified, the partitioned producer hashes the key and assigns the message to the corresponding partition.

Custom Partition

A user can create a custom routing mode by using the Java client and implementing the MessageRouter interface. The custom router is invoked to choose the partition for each message.
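Key-based routing can be sketched in Python as follows. The hash function here is a deliberately simple stand-in, not the hash Pulsar's clients actually use:

```python
# Toy sketch of key-based routing: hash the message key to pick a partition,
# mirroring what a partitioned producer does for keyed messages.
def route(key, num_partitions):
    # A stable hash keeps all messages with the same key on one partition,
    # which preserves per-key ordering.
    return sum(key.encode()) % num_partitions

# The same key always lands on the same partition.
print(route("order-42", 4) == route("order-42", 4))  # True
```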

Apache Pulsar Architecture

A Pulsar cluster consists of different parts. There may be one or more brokers, which handle and load-balance incoming messages from producers, dispatch messages to consumers, communicate with the Pulsar configuration store to handle various coordination tasks, and store messages in BookKeeper instances.

  • A BookKeeper cluster consisting of one or more bookies handles persistent storage of messages.
  • A ZooKeeper cluster called the configuration store handles coordination tasks that involve multiple clusters.


The broker is a stateless component that runs an HTTP server and a dispatcher. The HTTP server exposes a REST API for both administrative tasks and topic lookup for producers and consumers. The dispatcher is an async TCP server over a custom binary protocol used for all data transfers.


A Pulsar instance usually consists of one or more Pulsar clusters. A cluster consists of one or more brokers, a ZooKeeper quorum used for cluster-level configuration and coordination, and an ensemble of bookies used for persistent storage of messages.

Metadata store

Pulsar uses Apache ZooKeeper for metadata storage, cluster configuration, and coordination.

Persistent storage

Pulsar guarantees message delivery: if a message reaches a Pulsar broker successfully, it will be delivered to its intended target.

Pulsar Clients

Pulsar has client APIs for Java, Go, Python, and C++. The client API encapsulates and optimizes Pulsar's client-broker communication protocol while exposing a simple and intuitive API for applications. The current official Pulsar client libraries support transparent reconnection and connection failover to brokers, queuing of messages until acknowledged by the broker, and heuristics such as connection retries with backoff.

Client setup phase

When an application wants to create a producer or consumer, the Pulsar client library initiates a setup phase composed of two steps:

  1. The client attempts to determine the owner of the topic by sending an HTTP lookup request to a broker. The request can reach any active broker, which, by consulting the cached ZooKeeper metadata, replies with the broker serving the topic, or assigns the topic to the least-loaded broker if nobody is serving it.
  2. Once the client library has the broker address, it creates a TCP connection (or reuses an existing connection from the pool) and authenticates it. Within this connection, the broker and the client exchange binary commands from the custom protocol. At this point, the client sends a command to create a consumer or producer, which the broker carries out after validating the authorization policy.
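Step 1 can be modeled with a small Python sketch: a broker either returns the topic's cached owner or assigns the topic to the least-loaded broker. Broker names, loads, and the topic URLs are hypothetical:

```python
# Toy model of the topic-lookup step described above.
owners = {"persistent://tenant/ns/topic-a": "broker-1"}   # cached ownership
broker_load = {"broker-1": 12, "broker-2": 3, "broker-3": 7}

def lookup(topic):
    if topic in owners:
        return owners[topic]                        # topic already served
    owner = min(broker_load, key=broker_load.get)   # pick least-loaded broker
    owners[topic] = owner                           # record the assignment
    return owner

print(lookup("persistent://tenant/ns/topic-a"))  # broker-1
print(lookup("persistent://tenant/ns/topic-b"))  # broker-2
```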


Apache Pulsar's geo-replication enables messages to be produced in one geo-location and consumed in another. In the diagram above, whenever producers P1, P2, and P3 publish messages to the topic T1 on clusters A, B, and C respectively, those messages are instantly replicated across the clusters. Once replicated, consumers C1 and C2 can consume the messages from their respective clusters. Without geo-replication, consumers C1 and C2 would not be able to consume the messages published by producer P3.


Pulsar was created from the ground up as a multi-tenant system; Apache Pulsar supports multi-tenancy. Tenants can be spread across clusters, and each can have its own authentication and authorization scheme applied to it. Tenants are also the administrative unit at which storage, message TTL, and isolation policies are managed.


To each tenant in a particular Pulsar instance you can assign:

  • An authorization scheme.
  • The set of clusters to which the tenant's configuration applies.


Authentication and Authorization

Pulsar supports an authentication mechanism that can be configured at the broker, and it also supports authorization to identify clients and their access rights on topics and tenants.

Tiered Storage

Pulsar's architecture allows topic backlogs to grow very large, which makes retaining a rich history expensive over time. One way to alleviate this cost is tiered storage: older messages in the backlog can be moved from BookKeeper to cheaper storage, while clients can still access the older backlog.

Schema Registry

Type safety is paramount in communication between producers and consumers. For type safety in messaging, Pulsar adopts two basic approaches:

Client-side approach

In this approach, message producers and consumers are responsible not only for serializing and deserializing messages (which consist of raw bytes) but also for “knowing” which types are transmitted via which topics.

Server-side approach

In this approach, producers and consumers inform the system which data types can be transmitted via the topic. With this approach, the messaging system enforces type safety and ensures that producers and consumers remain in sync.

How do schemas work?

A Pulsar schema is applied and enforced at the topic level. Producers and consumers upload schemas to Pulsar when requested. A Pulsar schema consists of:

  • Name: the topic to which the schema is applied.
  • Payload: a binary representation of the schema.
  • Properties: a user-defined string/string map.

It supports the following schema formats:

  • JSON
  • Protobuf
  • Avro
  • String (used for UTF-8-encoded strings)

If no schema is defined, producers and consumers handle raw bytes.
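A toy server-side check in the spirit of the section above: the topic owns a schema (name, payload, format, properties), and a client's declared format must match it, while topics without a schema fall back to raw bytes. The registry functions here are invented for illustration and are not Pulsar's actual API.

```python
topic_schemas = {}   # topic -> schema dict held by the "server"

def upload_schema(topic, fmt, payload, properties=None):
    topic_schemas[topic] = {
        "name": topic,                    # name: the topic the schema applies to
        "format": fmt,                    # e.g. "json", "avro", "string"
        "payload": payload,               # binary representation of the schema
        "properties": properties or {},   # user-defined string/string map
    }

def check_compatible(topic, fmt):
    """Server-side enforcement: reject mismatched producer/consumer types."""
    schema = topic_schemas.get(topic)
    if schema is None:
        return True                       # no schema: raw bytes are allowed
    return schema["format"] == fmt

upload_schema("orders", "json", b'{"type": "object"}')
print(check_compatible("orders", "json"))     # True
print(check_compatible("orders", "avro"))     # False
print(check_compatible("raw-topic", "avro"))  # True (no schema registered)
```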

What are the Pros and Cons?

The pros and cons of Apache Pulsar are described below:

Pros:

  • Feature-rich: persistent and non-persistent topics
  • Multi-tenancy
  • More flexible client API, including CompletableFutures and a fluent interface
  • Java client components are thread-safe: a consumer can acknowledge messages from different threads

Cons:

  • The community base is small.
  • A reader can’t read the last message in the topic directly [it needs to skim through all the messages].
  • Higher operational complexity: ZooKeeper + broker nodes + BookKeeper, all clustered.
  • The Java clients have, to date, little to no Javadoc.

Apache Pulsar Multi-Layered Architecture


Difference between Apache Kafka and Apache Pulsar

| S.No. | Kafka | Apache Pulsar |
|---|---|---|
| 1 | More mature, with higher-level APIs. | Incorporates improved design elements on top of Kafka's existing capabilities. |
| 2 | Streaming is built on top of Kafka Streams. | Unified messaging model and API: streaming via exclusive/failover subscriptions, queuing via shared subscriptions. |
| 3 | Producer-topic-consumer group-consumer. | Producer-topic-subscription-consumer. |
| 4 | Restricts fluidity and flexibility. | Provides fluidity and flexibility. |
| 5 | Messages are deleted based on retention; if a consumer doesn't read messages before the retention period, it loses data. | Messages are deleted only after all subscriptions have consumed them, so no data is lost even if a subscription's consumers are down for a long time; messages can also be kept for a configured retention period even after all subscriptions have consumed them. |
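The deletion rules in row 5 can be contrasted with a toy model: a Kafka-style log drops messages past a retention window regardless of readers, while a Pulsar-style topic deletes a message only once every subscription has acknowledged it. This is purely illustrative.

```python
def kafka_retain(log, now, retention):
    """Keep only (timestamp, msg) entries still inside the retention window."""
    return [(ts, m) for ts, m in log if now - ts <= retention]

def pulsar_retain(msgs, acked_by):
    """Keep a message until every subscription has acknowledged it."""
    subs = list(acked_by)
    return [m for m in msgs if not all(m in acked_by[s] for s in subs)]

log = [(0, "a"), (5, "b"), (9, "c")]
print(kafka_retain(log, now=10, retention=4))   # [(9, 'c')] -> 'a' and 'b' are lost

msgs = ["a", "b", "c"]
acks = {"sub1": {"a", "b"}, "sub2": {"a"}}      # sub2's consumer is lagging
print(pulsar_retain(msgs, acks))                # ['b', 'c'] kept, no data loss
```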

Drawbacks of Kafka

  1. High Latency
  2. Poor Scalability
  3. Difficulty supporting global architecture (fulfilled by pulsar with the help of geo-replication)
  4. High OpEx (operation expenditure)

How Apache Pulsar is better than Kafka

  1. Pulsar has shown notable improvements in both latency and throughput when compared with Kafka: Pulsar is approximately 2.5 times faster and has about 40% lower latency than Kafka.
  2. Kafka, in many scenarios, has shown that it doesn’t cope well with thousands of topics and partitions, even if the data is not massive. Fortunately, Pulsar is designed to serve hundreds of thousands of topics in a single deployed cluster.
  3. Kafka stores data and logs in dedicated files and directories on each broker, which creates trouble at the time of scaling (files are loaded to disk periodically). In contrast, scaling is effortless in Pulsar because its brokers are stateless; Pulsar uses bookies to store data.
  4. Kafka brokers are designed to work together within a single region of the network, so it is not easy to work with a multi-datacenter architecture. Pulsar, on the other hand, offers geo-replication, with which users can easily replicate their data synchronously or asynchronously among any number of clusters.
  5. Multi-tenancy is a feature that can be of great use, as it provides different kinds of defined tenants specific to the needs of a particular client or organization. In layman’s terms, it is like defining a set of properties so that each property satisfies the needs of a specific group of clients/consumers.

Even though it looks like Kafka lags behind Pulsar, KIPs (Kafka Improvement Proposals) cover almost all of these drawbacks in their discussions, and users can hope to see the changes in upcoming versions of Kafka.

Kafka to Pulsar: users can easily migrate from Kafka to Pulsar, as Pulsar natively supports working directly with Kafka data through the connectors provided, or Kafka application data can be imported into Pulsar quite easily.

Pulsar SQL  uses Presto to query over the old messages that are kept in backlog (Apache BookKeeper).


Apache Pulsar is a powerful stream-processing platform that has learned from previously existing systems. Its layered architecture is complemented by a number of great out-of-the-box features, like multi-tenancy, zero-rebalancing downtime, geo-replication, proxying, durability, and TLS-based authentication/authorization. Compared to other platforms, Pulsar can give you the ultimate tools with more capabilities.

Original article source at:

#kafka #apache #architecture #benefits 

Apache Pulsar Architecture and Benefits
Rupert Beatty


Learn Overview Of Microservices and Service-Oriented Architecture

What is Service-Oriented Architecture?

  • Service-Oriented Architecture (SOA) is a software architectural style that structures an application by breaking it down into multiple components called services.
  • Each service represents a functional business domain.
  • In SOA applications, each service is independent and provides its own business purposes but can communicate with others across various platforms and languages.
  • SOA components are loosely coupled and use a central Enterprise Service Bus (ESB) to communicate.

What is a microservice?

  • On the other hand, a microservice is an architectural style that focuses on maintaining several independent services that work collectively to create an application.
  • Each individual service within a microservices architecture uses internal APIs to communicate.


  • Although SOA and microservices seem similar, they are still two different architecture types. Microservices are like a more fine-grained evolution of SOA.
  • One of their main differences is scope. Microservices are suited to smaller, modern web services.
  • Each service within a microservices architecture generally has one specific purpose, whereas components in SOA have more complex business purposes and functionality and are often implemented as subsystems.
  • SOA is therefore suited to larger enterprise application environments.
  • Another significant difference is how both architectures communicate. Every service in SOA communicates through an ESB. If this ESB fails, it compromises functionality across all services.
  • On the other hand, services within a microservice are entirely independent. If one fails, the rest of the services remain functional. Overall, Microservices are more error tolerant.
  • Today SOA applications are uncommon as it's an older architecture that may not be suitable for modern cloud-based applications. 
  • However, microservices were developed for the cloud-native movement, and most developers prefer the versatility of service independence they offer.
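The failure-mode difference described above can be sketched in a few lines: SOA services all route through one ESB, so an ESB outage stops everything, while microservices call each other directly and fail independently. Service names are invented for illustration.

```python
def call_via_esb(esb_up, service):
    """SOA style: every call crosses the central ESB, a single point of failure."""
    return f"{service}: ok" if esb_up else f"{service}: unreachable"

def call_direct(services_up, service):
    """Microservice style: services are independent; one outage stays local."""
    return f"{service}: ok" if services_up.get(service) else f"{service}: down"

# SOA: the ESB is down, so *every* service is unreachable.
print([call_via_esb(False, s) for s in ("billing", "search")])
# ['billing: unreachable', 'search: unreachable']

# Microservices: billing has failed, but search remains functional.
up = {"billing": False, "search": True}
print([call_direct(up, s) for s in ("billing", "search")])
# ['billing: down', 'search: ok']
```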

Original article source at:

#microservices #service #architecture 

Nigel Uys


Learn, Build and Deploy Microservices


Microservice Architecture:

From my previous blog, you must have got a basic understanding of Microservice Architecture. But being a professional with certified expertise in microservices requires more than just the basics. In this blog, you will get into the depth of the architectural concepts and implement them using an UBER case study.

In this blog, you will learn about the following:

  • Definition Of Microservice Architecture
  • Key Concepts Of Microservice Architecture
  • Pros And Cons Of Microservice Architecture
  • UBER – Case Study

You can refer to the What is Microservices, to understand the fundamentals and benefits of Microservices.

It will only be fair if I give you the definition of Microservices.

Definition Of Microservices

As such, there is no proper definition of Microservices aka Microservice Architecture, but you can say that it is a framework which consists of small, individually deployable services performing different operations.

Microservices focus on a single business domain that can be implemented as fully independent deployable services and implement them on different technology stacks.


Figure 1:  Difference Between Monolithic and Microservice Architecture – Microservice Architecture

Refer to the diagram above to understand the difference between monolithic and microservice architecture. For a better understanding of differences between both the architectures, you can refer to my previous blog What Is Microservices

To make you understand better, let me tell you some key concepts of microservice architecture.

Key Concepts Of  Microservice Architecture

Before you start building your own applications using microservices, you need to be clear about the scope and functionalities of your application.

Following are some guidelines to be followed while designing microservices.

Guidelines While Designing Microservices

  • As a developer, when you decide to build an application separate the domains and be clear with the functionalities.
  • Each microservice you design shall concentrate only on one service of the application.
  • Ensure that you have designed the application in such a way that each service is individually deployable.
  • Make sure that the communication between microservices is done via a stateless server.
  • Each service can be further refactored into smaller services, having their own microservices.

Now, that you have read through the basic guidelines while designing microservices, let’s understand the architecture of microservices. 

How Does Microservice Architecture Work?

A typical Microservice Architecture (MSA) should consist of the following components:

  1. Clients
  2. Identity Providers
  3. API Gateway
  4. Messaging Formats
  5. Databases
  6. Static Content
  7.  Management
  8. Service Discovery

Refer to the diagram below.


Figure 2:  Architecture Of Microservices – Microservice Architecture

I know the architecture looks a bit complex, but let me simplify it for you.

1. Clients

The architecture starts with different types of clients, from different devices trying to perform various management capabilities such as search, build, configure etc.

2. Identity Providers

Requests from the clients are first passed to the identity providers, which authenticate them. The authenticated requests are then communicated to the internal services via a well-defined API Gateway.

3. API Gateway

Since clients don’t call the services directly, API Gateway acts as an entry point for the clients to forward requests to appropriate microservices.

The advantages of using an API gateway include:

  • All the services can be updated without the clients knowing.
  • Services can also use messaging protocols that are not web-friendly.
  • The API Gateway can perform cross-cutting functions such as providing security, load balancing etc.
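The gateway role just described can be sketched minimally: clients hit one entry point, which matches the request path against a routing table and forwards it to the right microservice. The routes and service names are invented for illustration.

```python
# Routing table: path prefix -> owning microservice (hypothetical names).
ROUTES = {
    "/passengers": "passenger-service",
    "/drivers":    "driver-service",
    "/trips":      "trip-service",
}

def gateway(path):
    """Single entry point: match the path and forward to a microservice."""
    for prefix, service in ROUTES.items():
        if path.startswith(prefix):
            # Cross-cutting concerns (auth, load balancing, logging) go here,
            # so individual services can change without clients noticing.
            return f"forwarded to {service}"
    return "404: no matching service"

print(gateway("/trips/42"))   # forwarded to trip-service
print(gateway("/unknown"))    # 404: no matching service
```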

After receiving the requests of clients, the internal architecture consists of microservices which communicate with each other through messages to handle client requests.

4. Messaging Formats

There are two types of messages through which they communicate:

  • Synchronous Messages: in situations where clients wait for the response from a service, microservices usually use REST (Representational State Transfer), as it relies on the stateless, client-server HTTP protocol. REST suits a distributed environment in which each and every functionality is represented by a resource to carry out operations.
  • Asynchronous Messages: In the situation where clients do not wait for the responses from a service, Microservices usually tend to use protocols such as AMQP, STOMP, MQTT. These protocols are used in this type of communication since the nature of messages is defined and these messages have to be interoperable between implementations.
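The two styles above can be contrasted with a toy illustration: a synchronous call blocks until the service returns, while an asynchronous publish just enqueues the message for a worker to handle later. The queue stands in for an AMQP/MQTT broker; the service and order names are illustrative.

```python
from queue import Queue

def billing_service(order):
    """A stand-in microservice that produces a result for an order."""
    return f"invoice for {order}"

# Synchronous (REST-style): the caller waits for the response.
response = billing_service("order-1")

# Asynchronous (AMQP/MQTT-style): fire and forget via a message queue.
broker = Queue()
broker.put("order-2")                     # the client returns immediately

processed = []
while not broker.empty():                 # a worker drains the queue later
    processed.append(billing_service(broker.get()))

print(response)    # invoice for order-1
print(processed)   # ['invoice for order-2']
```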

The next question that may come to your mind is how do the applications using Microservices handle their data?

5. Data Handling

Well, each microservice owns a private database to capture its data and implement the respective business functionality. Also, the databases of microservices are updated through their service API only. Refer to the diagram below:

Figure 3:  Representation Of Microservices Handling Data – Microservice Architecture

The services provided by Microservices are carried forward to any remote service which supports inter-process communication for different technology stacks.
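The database-per-service rule above can be sketched as follows: each microservice owns a private store, and other services may read or change that data only through the owning service's API, never by touching the store directly. The class and method names are invented for illustration.

```python
class BillingService:
    """Owns its data; the API below is the only way in or out."""

    def __init__(self):
        self._db = {}                 # private database: service API only

    def create_invoice(self, trip_id, amount):
        self._db[trip_id] = amount    # the only write path

    def get_invoice(self, trip_id):
        return self._db.get(trip_id)  # the only read path

billing = BillingService()
billing.create_invoice("trip-42", 18.50)
print(billing.get_invoice("trip-42"))   # 18.5
```

Because every access goes through the API, the service can swap its storage technology without any other service noticing, which is one reason each microservice is free to pick its own database.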

6. Static Content

After the Microservices communicate within themselves, they deploy the static content to a cloud-based storage service that can deliver them directly to the clients via Content Delivery Networks (CDNs).

Apart from the above components, there are some other components that appear in a typical Microservice Architecture:

7. Management

This component is responsible for balancing the services on nodes and identifying failures.

8. Service Discovery

Acts as a guide to Microservices to find the route of communication between them as it maintains a list of services on which nodes are located.
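The registry described above can be sketched minimally: it maps service names to the nodes they currently run on, so callers look up a route instead of hard-coding addresses. The entries and addresses are invented for illustration.

```python
import random

# Service registry: service name -> nodes it is currently running on.
registry = {
    "trip-service":      ["10.0.0.5:8080", "10.0.0.6:8080"],
    "passenger-service": ["10.0.0.7:8080"],
}

def discover(service):
    """Return a node address for the service (naive client-side balancing)."""
    nodes = registry.get(service, [])
    if not nodes:
        raise LookupError(f"no nodes registered for {service}")
    return random.choice(nodes)

print(discover("passenger-service"))   # 10.0.0.7:8080
```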

Now, let’s look into the pros and cons of this architecture to gain a better understanding of when to use this architecture.

Pros and Cons of Microservice Architecture

Refer to the table below.


| Pros Of Microservice Architecture | Cons Of Microservice Architecture |
|---|---|
| Freedom to use different technologies | Increases troubleshooting challenges |
| Each microservice focuses on a single business capability | Increases delay due to remote calls |
| Supports individually deployable units | Increased effort for configuration and other operations |
| Allows frequent software releases | Difficult to maintain transaction safety |
| Ensures security of each service | Tough to track data across various service boundaries |
| Multiple services are developed and deployed in parallel | Difficult to move code between services |

Let us understand more about Microservices by comparing UBER’s previous architecture to the present one.


UBER’s Previous Architecture

Like many startups, UBER began its journey with a monolithic architecture built for a single offering in a single city. Having one codebase seemed clean at the time and solved UBER’s core business problems. However, as UBER started expanding worldwide, it faced various problems with respect to scalability and continuous integration.


  Figure 4:  Monolithic Architecture Of UBER – Microservice Architecture

The above diagram depicts UBER’s previous architecture.

  • A REST API is present with which the passenger and driver connect.
  • Three different adapters are used with API within them, to perform actions such as billing, payments, sending emails/messages that we see when we book a cab.
  • A MySQL database to store all their data.

So, if you notice here all the features such as passenger management, billing, notification features, payments, trip management and driver management were composed within a single framework.

Problem Statement

While UBER was expanding worldwide, this kind of framework introduced various challenges. The following are some of the prominent ones:

  • All the features had to be re-built, deployed and tested again and again to update a single feature.
  • Fixing bugs became extremely difficult in a single repository as developers had to change the code again and again.
  • Scaling the features simultaneously with the introduction of new features worldwide was quite tough to be handled together.


To avoid such problems, UBER decided to change its architecture and follow other hyper-growth companies like Amazon, Netflix, Twitter, and many others. Thus, UBER decided to break its monolithic architecture into multiple codebases to form a microservice architecture.

Refer to the diagram below to look at UBER’s microservice architecture.


Figure 5:  Microservice Architecture Of UBER – Microservice Architecture

  • The major change that we observe here is the introduction of API Gateway through which all the drivers and passengers are connected. From the API Gateway, all the internal points are connected such as passenger management, driver management, trip management and others.
  • The units are individual separate deployable units performing separate functionalities.
    • For Example: If you want to change anything in the billing Microservices, then you just have to deploy only billing Microservices and don’t have to deploy the others.
  • All the features were now scaled individually i.e. The interdependency between each and every feature was removed.
    • For example, we all know that the number of people searching for cabs is comparatively greater than the number of people actually booking a cab and making payments. From this we can infer that the number of processes working on the passenger-management microservice is greater than the number of processes working on payments.

In this way, UBER benefited by shifting its architecture from monolithic to Microservices.

I hope you have enjoyed reading this post on Microservice Architecture. I will be coming up with more blogs, which will contain hands-on as well.

If you wish to learn Microservices and build your own applications, then check out our Microservices Architecture Training which comes with instructor-led live training and real-life project experience. This training will help you understand Microservices in depth and help you achieve mastery over the subject.

Got a question for us? Please mention it in the comments section of ” Microservice Architecture” and I will get back to you.

Original article source at:

#microservices #architecture 
