The TransEvents Table

storeboss table

Much of the ensuing discussion will be about this table.  Please click here to see its the early beta version its DDL and load file DML.

There will be significant changes to this data object in the weeks ahead.  Stayed tuned for more info, as I further progress in the development of StoreBoss.

Please note that I am targeting MySQL on the Server, which means that much of the advanced temporal discussion on this blog will not apply directly to StoreBoss.

Also, most hosting providers do not allow users to save SQL output to a custom file location — unless they are paying for some super expensive plan.  With this in mind, I’m developing StoreBoss as a WP plugin that populates data to this (and perhaps other) custom tables. I will try to do most if not all of the data prep using SQL, given that the row counts are likely to be small enough that StoreBoss will not significantly impact production.

The data it contains could be downloaded, if necessary, for further processing by an end-user.


Unlocking WC Orders

free yourself

To regular store owners, the WC REST API is intimidatingly useless — a roadblock that does not allow a simple way to run customized or unique ad hoc queries they might wish to run.

To SQL-savvy data analysts who prefer to use straight-up relational declarations (that are not chopped up into little pieces by SQL Injection-driven paranoia), retrieving sales info through the REST API via some procedural intermediate language is an inelegant impedance mismatch hack: a baroque apparatus that is oppressive, particularly if an app does not expose the RDMBS to the outside world, and is only meant for inhouse consumption by trusted users who can be granted various levels of authorization.

Why not just get at the data directly?

sql query
Q1 and its result set, run using PHPMYADMIN

After testing various forms of Q1 (the query I’ve been playing around with during the last few posts), I arrived at a version that provides a succinct way of presenting Order-related information.

    p.ID AS "Trans ID",
    p.post_date AS "Trans Datetime",
    c.comment_content AS "Event Status",
    c.comment_date AS "Event Datetime",
    o.meta_value AS "Order Total"
    wp_posts p
LEFT JOIN wp_comments c ON
        p.post_type = "shop_order" AND p.ID = comment_post_ID AND c.comment_type = "order_note" AND c.comment_content NOT LIKE "Item%" AND c.comment_content NOT LIKE "Stock%"
INNER JOIN wp_postmeta o ON
        o.`meta_key` = "_order_total" AND p.ID = o.post_id
INNER JOIN wp_postmeta x3 ON
        x3.`meta_key` = "_completed_date" AND p.ID = x3.post_id
ORDER BY p.post_date;

As shown up top, Q1 produces a result set that lists by ascending date the predetermined state transformations that every Order (aka transaction) goes through in the WC database.

I have added (to previous incarnations of this query) two LIKE operators to eliminate superfluous events from the Orders event-driven timeline.

This does not completely remove the return of all non-payment related events (such as inventory changes) that are recorded in the wp_comments table.

The mask format of these comments has changed over the years, and it would be inefficient to add too much string processing to Q1 to account for every variation; besides, WC has lately seemed to stabilize how it persists inventory change information, as of 2019, so this should not be too much of a problem, going forward.

Q1 results can be saved to a local file on the server (with a slight modification to the code), then loaded to a custom table that shows :

  1. when a Sale was recorded in WC;
  2. when STRIPE (which is the payment processor I am using for test purposes) processed the CC payment, and,
  3. when the order itself was marked as completed (indicating that it was fulfilled by the merchant).

As I’ll demo in the next post, this custom Orders table can then be queried by an end-user to produce temporal analytics.

The query also results in other information of interest, such as when a card was declined, or if a customer was sent a confirmation email.

I tested the query in a production WC environment with thousands of records in the wp_posts table.  It works correctly, and is also quite fast — while utilizing subseconds of CPU time.

wc stripe charge
Here a real-world result set. I ran Q1 against a production WC site that I manage. Here we can see a Stripe Charge event being recorded against a $503 sale that took place this past Sunday (on 8-11-19). Note the transaction time is captured down to the second.

However, I have not stress-tested the query in a huge database with millions of orders.

Any such merchant will have an in-house IT department that performs these sort of backend tasks; the idea with StoreBoss (which is the code name of the plug-in app I’m building) is to provide DSS functionality on a Windows or Mac desktop, using SQLite and Python — from data that is derived (or “unlocked,” if you will) from the WC database — hence the catchy Widgeteria tag line, “AI for the little guy.” (‘Guy’ here being generic, not gender-specific.)

Armed with this simple query, and a vanilla script front end that sits in the WP admin panel, store owners will have an additional way (meaning, in addition to what WC offers, free or pay, by way of downloadable Orders data) of downloading their order data to the desktop.

But even within the WP admin panel env itself, the data is presented such as to be easily readable.  It also permits the ad hoc querying, in a straight forward manner, of temporal state changes that an Order goes through in  WC — without causing locks on the WC OLTP env.

How cool is that?!

WC order panel
I manually entered this order using the WC admin order panel. I used the sample product db provided by WC; unfortunately, WC does not provide a simple way to upload a testbed transaction db. Testing has to be done manually unless you are willing to try your hand at this. Btw, note that transaction seconds are not displayed in the WC Order panel.

In production, the query will be implemented as a parameterized stored procedure, one that accepts sanitized date ranges.

It should also be possible to run this query as a MariaDB event (like a cron file), one that runs, say, at midnight + 1 min every 24 hours, in order to produce a running tally of the previous day’s order processing, which could be then downloaded to the desktop for DSS or tick-and-tie processing.

wc order analytics
It seems that the WC Order analytics report doe not completely love Orders that are manually entered!

But for now, I am still experimenting with how to extract the simplest, cleanest data set possible from the WP database without having to deal with the complex and resource-consuming WooCommerce REST API.

Q1 seems to be a good start.

WP set date panel
When doing temporal analytics, make sure you first set WP to the correct time. In this case, I set it to EST. This will save a lot of headaches later on, when we perform some basic date math. The subject of temporal databases, particularly distributed ones, is vast; I will touch upon relevant aspects in various posts to come.

We are not too far away now from figuring out how to match STRIPE payment data to the receipt of actual funds (some unit of time later, minus any haircut) in a real bank account for honest-to-goodness real-world transaction verification!


Woo Orders – Part 2

sales receipt

WooCommerce relies on a 2 table schema — one fat, one skinny — as the primary method for constructing Order data.  A variety of other tables provide auxiliary information inflight.

It is difficult for a non-database professional to extract data from this sink.

Woo does not expect its customers to know how to write PhP code.  In Woo’s view, this is the job of specialists, sort of like old school IT departments —  ie, highly trained developers who are versed in the “technology stack” that is the hallmark of modern Web development.

Woo expects these specialists to use an interface called a “REST API” to extract or manipulate data that persists in the Woo/WP datastore.

Absent is the alluring promise of relational databases — with their declarative, somewhat verbose, set-based programming sublanguages, that in theory should allow a programmer to largely be unconcerned with how structured data is saved to disk or retrieved.

Thus, a non-technical Woo store owner cannot with ease directly query the Woo database in an ad hoc fashion; for example, to independently verify and/or audit the validity of Woo’s assertion as to what it holds to be relational facts.

Specialized, rather arcane knowledge, is required.

In my previous post, we saw how to extract a subset of Order data, particularly if we are interested in performing a form of applied ELH analysis to a WooCommerce Order — in other words, how does a Woo order get born, live, then die?

A simple query with only 1 inner join was needed to populate (insert) an ISAM DSS table with Order transaction rows.

We also saw how an Order passes through multiple states.

Woo maintains a log of these states in a table called wp_comments.  Rows with the field comment_type equal ‘order_note’ trace each step along the way.

That is a valuable table, a crucial one, in fact, for the purpose of creating a temporal ELH table.

A common way to capture temporal info is to declare a point-based data model, which is what we shall do with a table called trans_1.

The role of trans_1 is to:

  1.  Provide information relating to Order-level state transitions
  2. Create an efficient means to download or extract Order data from Woo without having to deal with its REST API
  3. Serve as an independent accounting-level verification for the data that is provided by Woo’s reports

Let’s assume for a moment, as a learning exercise, that there is no wp_comments table.  This will help us better understand the Woo Schema and how it uses WP’s underlying tables to track Orders data.

In such a case, how could we populate trans_1?

Leaving aside performance considerations, a trigger might be an attractive option to automatically populate our custom Trans (ie, Order) table. This, in fact, is one of the traditional solutions in data processing to this classic problem; another is having an app write twice, once to the TRANS table in question, the other to an associated historical, timestamped “versioning” table.  These normally feature rolling time-period views (such as yearly, semi-annually, quarterly, monthly, weekly, and daily).

The trigger approach would eliminate the need to write a cron job that would periodically run the INSERT query that we saw in the previous post. This begs the question, how frequently should it poll the wp_posts table? How do we ensure that state transitions are not missed?

To sidestep this problem, triggers may offer a low-code solution.

I will try to demonstrate how this might work, with an Insert trigger, MariaDB-style.  Essentially the trigger populates Table B, every time a row is inserted in Table A.

(In our real-world Order example, Table B would have to have, also, an Update trigger, to retrieve Transaction Amt order-level data from the auxiliary EVA table that the wp_posts table uses to construct virtual Order tables on the fly. This sucks, but there you have it.)

Here goes.

First, I will create a table in PHPMYADMIN that replicates or mirrors the structure of the sb_trans_1 table that we saw in the previous post.  I will call this table sb_trans_2.

`Trans_ID` bigint(20) NOT NULL, 
`Trans_status` varchar(20) NOT NULL, 
`Trans_date` datetime NOT NULL,
`Trans_amt` decimal(13,2) NOT NULL, 
PRIMARY KEY (`Trans_ID`, `Trans_status`) ) 

Here it is, in my local Sandbox MariaDB database.

Next, I will attach a trigger to sb_trans_1.  For demo purposes, I will use an artificial firing condition — that the Order Number exceeds an Order I manually created in the database.  In a real-world implementation, the test condition would, of course, be to perform a `post_type` = “shop_order” column eval on the wp_posts table.

Here’s the trigger:

CREATE TRIGGER addrow_sb_trans_2
AFTER INSERT ON sb_trans_1
IF NEW.Trans_ID > "117" THEN
INSERT INTO sb_trans_2 (Trans_ID, Trans_status, Trans_date, Trans_amt)
END; //

Now let’s add an order to the trans_1 table, #118. We can use the Insert Panel on PHP admin to accomplish this task.  As you can see below, trans_2 is automatically updated.

Conclusion:  the trigger idea shows promise — maybe.

Let’s try it on wp_posts, just for laughs.  This is the Big Kahuna table in the WP/WOO database universe.  We add the following trigger.

CREATE TRIGGER addrow_sb_trans_1 
IF NEW.post_type = "shop_order" 
THEN INSERT INTO sb_trans_1 
(Trans_ID, Trans_status, Trans_date) 
VALUES (NEW.ID, NEW.post_status, NEW.post_date);
END; //

Now let’s try to add a row using WooCommerce -> Orders in the WP admin panel.


And here is some further unexpected behavior.

The naive moral of this exercise is, don’t use triggers on WP/WC tables,

Out of curiosity, I investigated what caused the Fatal Error.  It turns out that our seemingly innocuous trigger resulted in the invocation of code that fires off a call at Line 160 that crashes the plugin. Why this happened shall forever remain one of the mysteries of life.

brackets code

Btw, I also tried, just for kicks, attaching a trigger to the wp_wc_order_stats, which is also populated with Orders data.

woo order stats table

This, too, didn’t work, but we might use this table down the road, as it provides a useful breakdown of the dollar amt descriptors that comprise a total Order amount.

In conclusion, it’s a good job that wp_comments exists.

But we’ve learned quite a bit about how Orders are handled internally in the WP database.  This will help us construct the SELECT queries that we will be needing.

In my next post, I will create a parameterized stored procedure that extracts and transforms Order data from the Woo tables mentioned above and loads the result into trans_1. Most likely I will write the result set to disk, then load the file into the table, as this is far faster than doing inserts, and I won’t have the transaction overhead to deal with. Besides, trans_1 is an ISAM table:  runs like greased lightning for DSS queries, not so fast OLTP.

From a data architecture standpoint, trans_1 will provide an ad hoc means to populate StoreBoss™ with dynamic Order data on a temporal basis.

All I would then have to do is put together a simple PHP GUI WP plug-in that supports CSV exporting to a local machine, and we can go on from there.

Most likely the trans_1 SP would look something like this:

p.ID AS "Trans ID",
p.post_date AS "Trans Date",
c.comment_content AS "Trans Status", 
c.comment_date AS "Status Date",
o.meta_value AS "Trans Amt"
FROM wp_posts p
LEFT JOIN wp_comments c ON
(p.post_type = "shop_order"
AND p.ID = comment_post_ID 
AND c.comment_type = "order_note")
INNER JOIN wp_postmeta o ON 
(o.`meta_key`= "_order_total"
AND p.ID = o.post_id)
ORDER BY p.post_date;

which results in

sql result