I have this query in my php and it seems to be fetching the wrong data set from my db.
$querystring = "SELECT a.*, b.itemColour,b.itemColourName FROM itemorders AS a INNER JOIN CatalogueItemsColour AS b ON a.colourID = b.colourID WHERE a.colourID IN(SELECT colourID FROM itemorders where orderID = 61 ) ";
Here is a picture of my results
Can i know why its not selecting the specific orderID of 61?
HTML to Markdown
MySQL is the all-time number one open source database in the world, and a staple in RDBMS space. DigitalOcean is quickly building its reputation as the developers cloud by providing an affordable, flexible and easy to use cloud platform for developers to work with. MySQL on DigitalOcean is a natural fit, but what’s the best way to deploy your cloud database? In this post, we are going to compare the top two providers, DigitalOcean Managed Databases for MySQL vs. ScaleGrid MySQL hosting on DigitalOcean.
At a glance – TLDR
ScaleGrid Blog - At a glance overview - 1st pointCompare Throughput
ScaleGrid averages almost 40% higher throughput over DigitalOcean for MySQL, with up to 46% higher throughput in write-intensive workloads. Read now
ScaleGrid Blog - At a glance overview - 2nd pointCompare Latency
On average, ScaleGrid achieves almost 30% lower latency over DigitalOcean for the same deployment configurations. Read now
ScaleGrid Blog - At a glance overview - 3rd pointCompare Pricing
ScaleGrid provides 30% more storage on average vs. DigitalOcean for MySQL at the same affordable price. Read now
MySQL DigitalOcean Performance Benchmark
In this benchmark, we compare equivalent plan sizes between ScaleGrid MySQL on DigitalOcean and DigitalOcean Managed Databases for MySQL. We are going to use a common, popular plan size using the below configurations for this performance benchmark:
ScaleGridDigitalOceanInstance TypeMedium: 4 vCPUsMedium: 4 vCPUsMySQL Version126.96.36.199.20RAM8GB8GBSSD140GB115GBDeployment TypeStandaloneStandaloneRegionSF03SF03SupportIncludedBusiness-level support included with account sizes over $500/monthMonthly Price$120$120
As you can see above, ScaleGrid and DigitalOcean offer the same plan configurations across this plan size, apart from SSD where ScaleGrid provides over 20% more storage for the same price.
To ensure the most accurate results in our performance tests, we run the benchmark four times for each comparison to find the average performance across throughput and latency over read-intensive workloads, balanced workloads, and write-intensive workloads.
In this benchmark, we measure MySQL throughput in terms of queries per second (QPS) to measure our query efficiency. To quickly summarize the results, we display read-intensive, write-intensive and balanced workload averages below for 150 threads for ScaleGrid vs. DigitalOcean MySQL:
ScaleGrid MySQL vs DigitalOcean Managed Databases - Throughput Performance Graph
For the common 150 thread comparison, ScaleGrid averages almost 40% higher throughput over DigitalOcean for MySQL, with up to 46% higher throughput in write-intensive workloads.
#cloud #database #developer #digital ocean #mysql #performance #scalegrid #95th percentile latency #balanced workloads #developers cloud #digitalocean droplet #digitalocean managed databases #digitalocean performance #digitalocean pricing #higher throughput #latency benchmark #lower latency #mysql benchmark setup #mysql client threads #mysql configuration #mysql digitalocean #mysql latency #mysql on digitalocean #mysql throughput #performance benchmark #queries per second #read-intensive #scalegrid mysql #scalegrid vs. digitalocean #throughput benchmark #write-intensive
MySQLis an open-source Relational Database Management System (RDBMS) backed by Oracle. MySQL is made up of two words- “My”, the name of co-founder Michael Widenius’s daughter, and “SQL”, the abbreviation for Structured Query Language.
The word Relational in RDBMS is the fundamental on which JOINS in MySQLwork. There can be many types and numbers of tables in any Database. JOINS as the name suggests link tables together using common columns. One basic example is, Table 1 has employee details and Table 2 has project details.
If we have to check the projects an employee is involved in, we can join the tables on employee name or employee id, hence doing so will join the tables and the new virtual table (can be saved) can be accessed with all the information required.
#sql #sql joins #mysql #mysql joins
One of the most frequently used transformations in Apache Spark is Join operation. Joins in Apache Spark allow the developer to combine two or more data frames based on certain (sortable) keys. The syntax for writing a join operation is simple but some times what goes on behind the curtain is lost. Internally, for Joins Apache Spark proposes a couple of Algorithms and then chooses one of them. Not knowing what these internal algorithms are, and which one does spark choose might make a simple Join operation expensive.
While opting for a Join Algorithm, Spark looks at the size of the data frames involved. It considers the Join type and condition specified, and hint (if any) to finally decide upon the algorithm to use. In most of the cases, Sort Merge join and Shuffle Hash join are the two major power horses that drive the Spark SQL joins. But if spark finds the size of one of the data frames less than a certain threshold, Spark puts up Broadcast Join as it’s top contender.
Looking at the Physical plan of a Join operation, a Broadcast Hash Join in Spark looks like this
The above plan shows that the data frame from one of the branches broadcasts to every node containing the other data frame. In each node, Spark then performs the final Join operation. This is Spark’s per-node communication strategy.
Spark uses the Broadcast Hash Join when one of the data frame’s size is less than the threshold set in spark.sql.autoBroadcastJoinThreshold. It’s default value is 10 Mb, but can be changed using the following code
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", 100 * 1024 * 1024)
This algorithm has the advantage that the other side of the join doesn’t require any shuffle. If this other side is very large, not doing the shuffle will bring notable speed-up as compared to other algorithms that would have to do the shuffle.
Broadcasting large datasets can also lead to timeout errors. A configuration spark.sql.broadcastTimeout sets the maximum time that a broadcast operation should take, past which the operation fails. The default timeout value is 5 minutes, but it can be set as follows:
If neither of the data frames can be broadcasted, then Spark resorts to Sort Merge Join. This algorithm uses the node-node communication strategy, where Spark shuffles the data across the cluster.
Sort Merge Join requires both sides of the join to have correct partitioning and order. Generally, this is ensured by** shuffle and sort** in both branches of the join as depicted below
#apache spark #scala #tech blogs #broadcast join #join opertaions #join optimization #joins in spark #shuffled hash join #sort merge join
CROSS JOIN is in the spotlight. This article finishes our small series of SQL JOIN-related publications.
SQL Server CROSS JOIN is the simplest of all joins. It implements a combination of 2 tables without a join condition. If you have 5 rows in one table and 3 rows in another, you get 15 combinations. Another definition is a Cartesian Product.
Now, why would you want to combine tables without a join condition? Hang on a bit because we are getting there. First, let’s refer to the syntax.
#sql server #cross join #inner join #outer join #sql join #sql
MySQL does not limit the number of slaves that you can connect to the master server in a replication topology. However, as the number of slaves increases, they will have a toll on the master resources because the binary logs will need to be served to different slaves working at different speeds. If the data churn on the master is high, the serving of binary logs alone could saturate the network interface of the master.
A classic solution for this problem is to deploy a binlog server – an intermediate proxy server that sits between the master and its slaves. The binlog server is set up as a slave to the master, and in turn, acts as a master to the original set of slaves. It receives binary log events from the master, does not apply these events, but serves them to all the other slaves. This way, the load on the master is tremendously reduced, and at the same time, the binlog server serves the binlogs more efficiently to slaves since it does not have to do any other database server processing.
Ripple is an open source binlog server developed by Pavel Ivanov. A blog post from Percona, titled MySQL Ripple: The First Impression of a MySQL Binlog Server, gives a very good introduction to deploying and using Ripple. I had an opportunity to explore Ripple in some more detail and wanted to share my observations through this post.
Ripple supports only GTID mode, and not file and position-based replication. If your master is running in non-GTID mode, you will get this error from Ripple:
Failed to read packet: Got error reading packet from server: The replication sender thread cannot start in AUTO_POSITION mode: this server has GTID_MODE = OFF instead of ON.
You can specify Server_id and UUID for the ripple server using the cmd line options: -ripple_server_id and -ripple_server_uuid
Both are optional parameters, and if not specified, Ripple will use the default server_id=112211 and uuid will be auto generated.
While connecting to the master, you can specify the replication user and password using the command line options:
-ripple_master_user and -ripple_master_password
You can use the command line options -ripple_server_ports and -ripple_server_address to specify the connection end points for the Ripple server. Ensure to specify the network accessible hostname or IP address of your Ripple server as the -rippple_server_address. Otherwise, by default, Ripple will bind to localhost and hence you will not be able to connect to it remotely.
You can use the CHANGE MASTER TO command to connect your slaves to replicate from the Ripple server.
To ensure that Ripple can authenticate the password that you use to connect to it, you need to start Ripple by specifying the option -ripple_server_password_hash
For example, if you start the ripple server with the command:
rippled -ripple_datadir=./binlog_server -ripple_master_address= <master ip> -ripple_master_port=3306 -ripple_master_user=repl -ripple_master_password='password' -ripple_server_ports=15000 -ripple_server_address='172.31.23.201' -ripple_server_password_hash='EF8C75CB6E99A0732D2DE207DAEF65D555BDFB8E'
you can use the following CHANGE MASTER TO command to connect from the slave:
CHANGE MASTER TO master_host='172.31.23.201', master_port=15000, master_password=’XpKWeZRNH5#satCI’, master_user=’rep’
Note that the password hash specified for the Ripple server corresponds to the text password used in the CHANGE MASTER TO command. Currently, Ripple does not authenticate based on the usernames and accepts any non-empty username as long as the password matches.
It’s possible to monitor and manage the Ripple server using the MySQL protocol from any standard MySQL client. There are a limited set of commands that are supported which you can see directly in the source code on the mysql-ripple GitHub page.
Some of the useful commands are:
SELECT @@global.gtid_executed;– To see the GTID SET of the Ripple server based on its downloaded binary logs.
STOP SLAVE;– To disconnect the Ripple server from the master.
START SLAVE;– To connect the Ripple server to the master.
#cloud #database #developer #high availability #mysql #performance #binary logs #gtid replication #mysql binlog #mysql protocol #mysql ripple #mysql server #parallel threads #proxy server #replication topology #ripple server