> Database design is not hard ... I'm kind of curious how any of these problems are solved by an ORM
Database design is indeed not hard (for the vast majority of use cases), but we can't predict the future. I'll give you an example from our app.
We've recently had to update the schema to support multiple different tax amounts per transaction. In the old schema this touches several different tables; each now gets an extra child table, plus a shared lookup table for the tax rates, so that we can write and read those tax amounts for each transaction (basically a 1..n child table per parent table). Pretty basic stuff.
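As a rough sketch of that change (table and column names are hypothetical, and I'm using SQLite via Python just to make it runnable): the tax amount moves out of the parent row into a 1..n child table keyed against a rate lookup.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Old shape (hypothetical names): tax baked into the parent row.
cur.execute("CREATE TABLE old_transactions (id INTEGER PRIMARY KEY, amount REAL, tax_amount REAL)")

# New shape: a shared tax-rate lookup plus a 1..n child table per parent.
cur.execute("CREATE TABLE tax_rates (id INTEGER PRIMARY KEY, name TEXT, rate REAL)")
cur.execute("CREATE TABLE transactions (id INTEGER PRIMARY KEY, amount REAL)")
cur.execute("""CREATE TABLE transaction_taxes (
    transaction_id INTEGER REFERENCES transactions(id),
    tax_rate_id    INTEGER REFERENCES tax_rates(id),
    tax_amount     REAL)""")

# One transaction can now carry several tax amounts.
cur.execute("INSERT INTO tax_rates VALUES (1, 'VAT', 0.20), (2, 'levy', 0.05)")
cur.execute("INSERT INTO transactions VALUES (1, 100.0)")
cur.execute("INSERT INTO transaction_taxes VALUES (1, 1, 20.0), (1, 2, 5.0)")

total_tax = cur.execute(
    "SELECT SUM(tax_amount) FROM transaction_taxes WHERE transaction_id = 1"
).fetchone()[0]
print(total_tax)  # 25.0
```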
Anyway, historically, i.e. when the app was written 15 years ago, it operated in a single market with a single tax rate on transactions. The original developers didn't foresee that the platform would turn into a multi-billion-turnover system that would expand into different countries with their own complex tax rules. Back then it made sense to have an amount column and a tax_amount column. Now it doesn't, because hell no, I want this done right: we're not going to keep adding columns, we're going to model and normalise this correctly and move the tax amount(s) into child tables.
Because the older code had all of its interaction with the database not so abstracted (poorly factored, if you will), we had to update dozens (I forget how many) of queries to add all the new joins. Of course it all needs to be backwards and forwards compatible, and we have to roll this out piece by piece because, as I said above, this is a multi-billion-turnover app.
The existing queries run to 50+ lines of SQL with a dozen or so joins in some cases. That's not actually complex SQL; it's pretty mundane, trivial SQL: SELECT a few columns, JOIN a few other tables, WHERE stuff matches conditions. There's the occasional IF, CASE, or COALESCE in these queries as well, and the odd UNION here and there. Again, pretty simple stuff. There are a few "nice" surprises from it being an old schema: polymorphic relations, for example.
The changes amounted to hundreds of lines of SQL additions, the equivalent in actual "getting stuff done" code additions, making sure it's all tested so there are no regressions in moving from the old tax column in the parent table to reading the tax amounts from the new tables, and weeks of testing and logging to find any edge cases or bugs.
In the newer model code that wraps all this stuff in an ORM? A few lines of code, because we can just compose a role into the model classes giving them access to a tax_amounts method that returns the details of each transaction's multiple tax amounts + rates + other stuff. The ORM knows about the new tables, the queries run by it add in all the necessary joins for us. The model classes are correctly factored to compose the Taxes role.
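The role composition described above is a Perl (Moose/DBIx::Class-style) idiom; a rough Python analogy with a mixin, all names hypothetical, looks like this. The point is that only the role knows about the new child and lookup tables, so each model gains tax_amounts "for free":

```python
class TaxesRole:
    """Mixin standing in for the Taxes role: the only place that knows
    how the per-transaction tax rows are stored."""

    def tax_amounts(self):
        # In the real app the ORM would add the joins against the new
        # child/lookup tables; here we just read a plain attribute.
        return [
            {"name": t["name"], "rate": t["rate"], "amount": t["amount"]}
            for t in self._tax_rows
        ]

class Transaction(TaxesRole):
    def __init__(self, amount, tax_rows):
        self.amount = amount
        self._tax_rows = tax_rows

txn = Transaction(100.0, [
    {"name": "VAT", "rate": 0.20, "amount": 20.0},
    {"name": "levy", "rate": 0.05, "amount": 5.0},
])
total_tax = sum(t["amount"] for t in txn.tax_amounts())
print(total_tax)  # 25.0
```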
Now you can argue that if the original code had been better factored then we wouldn't have been in the same situation, and it would have been easier, but we would still have had to update multiple queries. This is where the ORM shines, and it's why I'm tired of writing trivial SQL, having done so for twenty years against half a dozen different database engines. I really don't want to have to write SELECT a few columns, JOIN a few other tables, WHERE stuff matches conditions type queries again. Unless of course I'm sat in the database terminal client, which happens more times a day than I can COUNT.
That said, your old and new solutions both sound wrong to me.
Tax is a line item.
To me it's as much a line item as a product, service, shipping, handling, booking fee, or any other thing that gets summed to form the total order / transaction value.
You then get to reference it, optionally, in whatever way makes sense to map the attribution correctly.
Tax belongs to an item? No problem. Two taxes, one item, no problem! One tax multiple items? No problem!
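A minimal sketch of that line-item model (hypothetical schema, SQLite via Python just to make it concrete): tax is just another row in line_items, and a separate attribution table handles both the n-taxes-per-item and the one-tax-many-items cases.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Tax as a line item: everything that sums to the order total lives in
# one table, tax included.
cur.execute("""CREATE TABLE line_items (
    id       INTEGER PRIMARY KEY,
    order_id INTEGER,
    kind     TEXT,   -- 'product', 'shipping', 'tax', ...
    amount   REAL)""")

# Optional attribution: which tax line applies to which item line.
# Because it's a separate mapping table, one tax can cover many items
# and one item can carry many taxes.
cur.execute("""CREATE TABLE tax_attributions (
    tax_line_id  INTEGER REFERENCES line_items(id),
    item_line_id INTEGER REFERENCES line_items(id))""")

cur.execute("""INSERT INTO line_items VALUES
    (1, 1, 'product',  100.0),
    (2, 1, 'shipping',  10.0),
    (3, 1, 'tax',       20.0),
    (4, 1, 'tax',        2.0)""")
# Two taxes on one item, and one of those taxes also covers shipping.
cur.execute("INSERT INTO tax_attributions VALUES (3, 1), (4, 1), (3, 2)")

order_total = cur.execute(
    "SELECT SUM(amount) FROM line_items WHERE order_id = 1"
).fetchone()[0]
print(order_total)  # 132.0
```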
> Not really sure how ORM solves anything here. That said, your old and new solutions both sound wrong to me.
The new solution does what you talk about, I guess I didn't explain it very well.
The use of an ORM is abstracting away a lot of the technical debt in the schema and isolating it so we can concentrate on fixing that debt without the worry that it continues to spread. It's much looser coupling than continuing to litter the codebase with manually written SQL.
Edit: reading more of the replies to your original comment: there's btschaegg's comment saying that a narrow scope is what you should aim for[1]. There's also a reply (the first one) on Stack Overflow about not leaking the abstraction[2]. Finally, the key to our use of an ORM is that the ORM is not our model; we don't leak the abstraction, and we can drop down to manual SQL when we need to, but that is still within a narrow scope.
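The "ORM is not our model" idea can be sketched roughly like this (Python stand-in, all names hypothetical): callers depend on one narrow repository interface that returns plain domain objects, so the backing implementation can be an ORM query or hand-written SQL without anything else noticing.

```python
from dataclasses import dataclass

@dataclass
class TaxAmount:
    name: str
    rate: float
    amount: float

class TransactionRepository:
    """The one narrow seam: controllers/views depend on this interface,
    never on the ORM directly."""

    def tax_amounts(self, transaction_id):
        raise NotImplementedError

class InMemoryRepository(TransactionRepository):
    # Stand-in for the ORM-backed implementation; a raw-SQL version
    # could be dropped in behind the same interface when needed.
    def __init__(self, rows):
        self._rows = rows

    def tax_amounts(self, transaction_id):
        return [TaxAmount(*r) for r in self._rows.get(transaction_id, [])]

repo = InMemoryRepository({1: [("VAT", 0.20, 20.0), ("levy", 0.05, 5.0)]})
amounts = repo.tax_amounts(1)
print(len(amounts), amounts[0].name)  # 2 VAT
```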
> The use of an ORM is abstracting away a lot of the technical debt in the schema and isolating it so we can concentrate on fixing that debt without the worry that it continues to spread.
Is there a good reason you're abstracting over technical debt instead of just paying it down?
I mean, I get it, it's convenient, but now you have at least two sources of truth: one for your database, one for your application, and then god knows how many for the rest of the company.
If you fix your underlying schema things will break. Good. You might see it.
> It's much looser coupling than continuing to litter the codebase with manually written SQL.
How is SQL litter? It's as much litter as fopen.
How is it much looser coupled? To what? The data? Why would you even want that?
Loose coupling is only a benefit when the cost of breaking change is significantly higher than the cost of an extra layer of abstraction and indirection.
My core problem with ORMs is that I'm not sold they're a net positive ROI. They have benefits, sure, but they come at a steep price I'm not prepared to pay.
Kind of agree with a lot of it but it breaks down pretty rapidly.
> Instead, with a properly designed database and an OLTP use case, ORMs are golden. Most of the grunt work goes away and with tools such as DBIx::Class::Schema::Loader, I can go from a good database schema to working Perl code in minutes.
I don't actually care about grunt work; writing code is like 5% of my time. I also question whether it's even a true statement. ORMs tend to lead to lots of very customized queries that are never abstracted. They also lead to a lot of duplicate code that is hard to detect.
> In reality, what happens is that people leak their ORM code all over their controllers (and views) and when they hit scalability issues, they start blaming the ORM rather than their architecture. The ORM gets a bad rap (I see this repeatedly for many clients). Instead, hide that abstraction so that when you've genuinely hit ORM limits, you can choose appropriate solutions for your problem rather than let code be so tightly coupled to the ORM that you're hog-tied.
So the suggestion is to not only use an ORM to abstract the DB+SQL away from your application but to also then abstract your ORM away from your code. What's next? Abstracting your AbstractORM into an AbstractAbstractORM? Can you see the problem here? If I have to abstract the DAL from my code base (and I do) why not just do it once?
...
> As Rob Kinyon made clear above, reporting tends to be a weakness in ORMs. This is a subset of a larger problem where complicated SQL or SQL which spans multiple tables sometimes doesn't work well with ORMs. For example, sometimes the ORM forces a join type I don't want and I can't tell how to fix that. Or maybe I want to use an index hint in MySQL, but it's not easy. Or sometimes the SQL is just so darned complicated that it would be nicer to write the SQL rather than the abstraction provided.
So if I blame ORMs it's my architecture's fault, but when you do it, it's just "other limitations". I think if the underlying SQL is easier to write than the abstraction code then the abstraction has failed, or, to my original point, is just straight-up half-baked and incomplete.
(lots of "yous" I know you didn't write this, assume I'm being pretty general, none of this is personal)