Reducing complexity

I feel awful, yet happy, yet tired.

A bit of an annoyance is when you spend hours doing something, then a week or two later you see a new library that does the same thing and simplifies the code greatly, improving maintainability and readability at the same time.

So, take it as you may, but if you are using classes like QXmlStreamReader or QDomDocument, do yourself a favor and try the pugixml library. It’s said to be faster than both of them. In terms of code readability, QDomDocument is the closest match to pugixml thanks to its DOM API, yet it’s also slower and consumes a hefty amount of memory in exchange, whereas pugixml stays lean.

The inadequacy of HTML, delving into C, and pondering remote jobs

I’ve been working with HTML in low-level languages like C/C++ and I must say that it’s been a rather frustrating experience. One would think that the task I’m doing should have been done ages ago, back at the conception of the project. Alas, that’s not quite right. What you fail to realize is that you should NEVER underestimate the dirtiness of HTML outputs all around the world. This whole week has been about cleaning/scrubbing/sanitizing HTML. I ended up grabbing two rather awesome third-party libraries, one called pugixml and the other libtidy (also known as tidy-html5).

libtidy, being a pure C library, took me a while to get the hang of, as I’m not that experienced in C. Even though I’m writing C++, it doesn’t mean I’m working with pointers the majority of the time (I mostly keep things on the stack). It was more of a game of “copy this chunk of memory and pass it as an argument to this C++-powered method so I have more control over it with the classes I’ve built”. But I needed it to work, regardless of how long it took to get right, because without tidy it would be really, really hard to do any scrubbing. pugixml and other XML parsers can get quite cranky while parsing misplaced tags.

Which brings me to pugixml!

Apparently the QtXml module is no longer actively developed, as it has reached a mature state. The classes around QDomDocument generally feel suited to the job; sadly, the documentation stresses that it uses too much memory, and apparently “pugixml is faster than QXmlStreamReader”.

In the end, I chose pugixml because it’s just simple; it doesn’t have the annoyances QXmlStreamReader brings.

Another subject I wanted to bring up is my interest in the C language.

I’ve been thinking of learning more about C. At least the basics, to defend myself in cases like libtidy. I also want to learn more about C because of Gtk3 and to get closer to systems programming (not that C++ isn’t capable). Mix this interest with me wanting to work with microcomputers or embedded systems and I might make something out of it.

Lastly, there’s that thought of me wanting to get a job related to my field. I might try to get a remote job; well, time will tell. For now I’m too tired, mentally spent, and yet as I finish this post I have a class to attend in college. 🙁

SQLite and the importance of indexes

I didn’t spend too much time on this issue, but as someone whose SQL knowledge is somewhere between basic and intermediate, it still caught me off guard how easy it is to end up with insanely slow queries.

Take an innocent query like this. One would think SQLite has some sort of meta information stored already, but it doesn’t:

select mytable.*,
   (select count(*)
    from mytable2
    where mytable2.id = mytable.id) as count
 from mytable;

It gets all the columns in mytable and, for each row, runs a COUNT in a subquery over the rows in mytable2 with the same id, so the result includes the total count per id.
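To make that concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The column names other than id (name, payload) and the sample rows are made up for illustration, and it assumes the outer table is mytable.

```python
import sqlite3

# In-memory database with two tiny tables; "name" and "payload" are
# hypothetical columns, only "id" comes from the query in the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table mytable  (id integer primary key, name text);
    create table mytable2 (id integer, payload text);
    insert into mytable  values (1, 'a'), (2, 'b'), (3, 'c');
    insert into mytable2 values (1, 'x'), (1, 'y'), (2, 'z');
""")

# Same shape as the query above: every column of mytable plus the
# number of mytable2 rows that share its id.
rows = conn.execute("""
    select mytable.*,
       (select count(*)
        from mytable2
        where mytable2.id = mytable.id) as count
    from mytable
    order by mytable.id
""").fetchall()

for row in rows:
    print(row)
```

Each printed tuple is one row of mytable followed by its count; an id with no matching rows in mytable2 gets a count of 0 instead of disappearing from the result.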

It took 7662ms to run the query on two tables that have 387 and 4,800+ rows respectively. That’s not acceptable in any way; putting that sort of query in production will eventually make your customers leave because of how slow the system is.

I googled a bit on the issue and found that SQLite doesn’t build an index for a column on its own unless it’s a primary key or unique. I went ahead and created an index on the second table, as the first one already has an index (primary key).
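For reference, creating that index and checking whether SQLite picks it up could look roughly like this. It is a sketch with Python’s sqlite3 module, not my exact setup: the table definitions are assumptions, and the index name is the one used in the final query further down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: only the id columns matter for this query.
conn.executescript("""
    create table mytable  (id integer primary key, name text);
    create table mytable2 (id integer, payload text);
""")

QUERY = """
    select mytable.*,
       (select count(*)
        from mytable2
        where mytable2.id = mytable.id) as count
    from mytable
"""

def plan(connection, sql):
    # EXPLAIN QUERY PLAN returns one row per step; the human-readable
    # description is the fourth column.
    steps = connection.execute("explain query plan " + sql).fetchall()
    return " | ".join(step[3] for step in steps)

before = plan(conn, QUERY)

# The fix: an index on the column the subquery filters on.
conn.execute("create index my_awesome_index on mytable2(id)")

after = plan(conn, QUERY)

print("before:", before)
print("after: ", after)
```

The exact wording of the plan varies between SQLite versions, so treat the output as a hint rather than something to parse. What to look for is the index name showing up in the "after" line.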

I managed to trim those seconds down to 240ms, which imho is pretty acceptable, yet it still makes me feel uneasy.

The result?

select mytable.*,
   (select count(ROWID)
    from mytable2
    indexed by my_awesome_index
    where mytable2.id = mytable.id) as count
 from mytable;

That’s all it took.
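One thing worth knowing about INDEXED BY: it is a hard requirement, not a hint. If the named index doesn’t exist, SQLite refuses to prepare the statement. A small sketch with Python’s sqlite3 module shows both cases; the schema and sample rows are assumptions for illustration.

```python
import sqlite3

QUERY = """
    select mytable.*,
       (select count(ROWID)
        from mytable2
        indexed by my_awesome_index
        where mytable2.id = mytable.id) as count
    from mytable
    order by mytable.id
"""

def run(create_index):
    # Fresh in-memory database each time, with hypothetical sample data.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table mytable  (id integer primary key, name text);
        create table mytable2 (id integer, payload text);
        insert into mytable  values (1, 'a'), (2, 'b'), (3, 'c');
        insert into mytable2 values (1, 'x'), (1, 'y'), (2, 'z');
    """)
    if create_index:
        conn.execute("create index my_awesome_index on mytable2(id)")
    try:
        return conn.execute(QUERY).fetchall()
    except sqlite3.OperationalError as exc:
        # Without the index the statement cannot even be prepared.
        return str(exc)

with_index = run(create_index=True)
without_index = run(create_index=False)

print(with_index)
print(without_index)
```

So if the index ever gets dropped or renamed, this query stops working instead of silently getting slow again. Whether that is a feature or a trap depends on how you look at it.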

I’m still not completely satisfied with results like this; I will keep looking for a way to trim those 200ms down to 20 or 50. I have an idea on how to do it, but for now this will do.

Repositories

If by any chance you click a GitHub link related to my account, I have to say that I’ve removed myself from GitHub and now use Bitbucket as my go-to for “backing up” and collaborating. There’s also GitLab, but I have found that their interface and site are so… so slow compared to Bitbucket.

Support

Yesterday, one of the most beloved wrestlers in the WWE, Daniel Bryan, announced his retirement, followed by one of the most passionate, honest goodbyes in the WWE. Daniel retired for medical reasons; he was given the final slot on Raw yesterday, which then continued on the WWE Network for an extended time.

It was heartbreaking.

I love wrestling, I really do. Seeing one of the wrestlers you cheered for go down for whatever reason hurts to some extent. Today though, I was reflecting on some of his words, his interactions with his family, and so many other things that made me question, “what am I doing?”

My interactions with people haven’t been the best; I feel like most of the time I have spent has been wasted. I’ve been ungrateful. I seldom smile honestly; it’s something that I have taught myself to do over the years. I wanted to learn how to “act” normal, while missing the point of being myself.

I wish I could say I tried, but this isn’t the case. I have given up so many times without trying, yet you see this guy, Daniel? He’s a fighter, he kept pushing on even with all the injuries. I didn’t. I always come around, really late, pick myself up and push once again.

Which is why it was never a matter of “trying harder” but of having the desire to want it. There are a lot of issues I want to resolve… I guess time heals old wounds. I did learn a lot yesterday; I just don’t think I can put it into words.