Software Design by Construx
Introduction
Software Productivity Studies
Code Wars
The course referenced a study by Tom DeMarco and Timothy Lister showing that solutions with fewer lines of code took less time to complete, regardless of the programming language. For a given language, the largest and smallest solutions differed by an order of magnitude.
Managing the Real Leverage in Software Productivity and Quality
A study by Dr. Bill Curtis entitled “Managing the Real Leverage in Software Productivity and Quality” had developers examine the same program, which contained a single error. The study measured how long each developer took to find and fix the error.
The results were astounding (all correct answers). The ratio was 1 to 22 from the shortest time to the longest, but even more interesting was their additional finding:
“The correlation between years of experience and performance was essentially zero. We found that the breadth of different programming experiences was the best predictor of performance. … we virtually always found a correlation between performance and the number of languages (and paradigms) known.”
The Net Negative Producing Programmer (Gordon Schulmeyer)
C2 Wiki Article
PDF of the article
We've known since the early '60s but have never come to grips with the implications that there are net negative producing programmers (NNPPs) on almost all projects, who insert enough spoilage to exceed the value of their production.
Taking a poor performer off the team can often be more productive than adding a good one.
But it gets better:
In a team of ten, expect as many as three people to have a defect rate high enough to make them NNPPs. With a normal distribution of skills, the probability that there is not even one NNPP out of ten is virtually nil.
And better:
If you are unfortunate enough to work on a high-defect project (density of from thirty to sixty defects per thousand lines of executable code), then fully half of your team may be NNPPs.
At least 20% of developers are NNPPs. This correlates with the DeMarco and Lister study.
What is Design?
Defining the term
The flow from requirements to test
Requirements -> Design -> Construction -> Test
- Requirements are decisions made by someone else
- Design is made up of decisions that developers make
- Construction and design are different views of the same thing (there isn't a boundary)
Software Design is a description of the software's internal structure that will serve as a basis for its construction
Identify the significant elements of that internal structure. Explain the important relationships between those elements. Provide sufficient detail to construct those elements.
Design as a complexity management tool
Why projects get in trouble: number one reason is bad requirements. Number two is poor project management. Number three is project complexity. One and two are largely outside of the control of the developers. Three is where we should focus our efforts.
- Essential vs. accidental complexity
  - Essential complexity lives in the problem space, and there's not much you can do to simplify it
  - Accidental complexity lives in the solution space
Accidental complexity comes in two different forms: necessary and unnecessary. Necessary: doing something to meet performance or scalability requirements. Unnecessary: doing something because a technique is interesting (résumé-driven development).
The goal: minimize the amount of essential and necessary accidental complexity and get rid of unnecessary accidental complexity (refactoring).
Design is a complexity management tool.
Design as a communication tool
- Analysis and validation of proposed designs
- Evaluation of design tradeoffs
- Design as the basis for actual construction
- Design as a resource for unit, integration, and component testing
There's no such thing as self-documenting code. Code will only ever tell you what it does, not what it was intended to do. Does the code "have to" look this way, or does it just "happen to" look this way?
Design docs should emphasize why rather than how.
Critical design ideas
In a typical software budget, 80% of the budget is spent on maintaining existing code.
Decreasing maintenance costs is much more powerful than decreasing development costs.
Software can kill people because of dumb design decisions. Prototyping is very useful for exploring and testing various design ideas.
Some design is better than no design, but the best is the enemy of the good.
At some point you just need to shoot the engineer and go into production.
Design is almost always an evolutionary process.
Good design leads to efficiency and effectiveness, both in the code and in the project that creates the code.
Measuring Design Complexity
If you can't measure it, you can't manage it.
These are heuristics.
Cyclomatic Complexity
The number of linearly independent paths through a section of code, which equals the number of decision points plus one. Count the decision points: most control statements add 1 to the complexity, except for else and default, which add nothing since they introduce no new decision.
[Schroeder 99]
11% of the codebase was responsible for 42% of the defects. As cyclomatic complexity goes up, defect density goes up with it. Tabs vs. spaces, curly brace placement, etc. don't matter in comparison to complexity. They're cosmetic.
There is no excuse for a cyclomatic complexity of 15 or greater. From 10 to 14, we have to have a discussion. From 1 to 9 is good.
- Green zone: up to 9
- Yellow zone: 10 to 14
- Red zone: >= 15
The correlation of cyclomatic complexity to defects is strongest at the function (method on a class) level. It is weak at the class level and gone at the package level, where more lines of code simply imply a proportionally higher number of defects.
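As a rough illustration of the counting rule, the sketch below uses Python's standard ast module to count decision points in a piece of source code and add one. The function names and the exact set of node types counted are assumptions made for this example, not something from the course; real tools differ on details such as boolean operators and case labels.

```python
import ast

# Node types treated as decision points in this sketch (an assumption;
# real tools differ on details such as boolean operators).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Return the number of decision points in `source`, plus one."""
    tree = ast.parse(source)
    decisions = sum(isinstance(node, DECISION_NODES) for node in ast.walk(tree))
    return decisions + 1


def zone(complexity: int) -> str:
    """Map a complexity value to the green/yellow/red zones listed above."""
    if complexity >= 15:
        return "red"
    if complexity >= 10:
        return "yellow"
    return "green"


# `if` and `elif` each add a decision; `else` adds none.
CLASSIFY_SOURCE = '''
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"
'''

complexity = cyclomatic_complexity(CLASSIFY_SOURCE)
print(complexity, zone(complexity))
```

The sample has two decisions (the if and the elif), so the sketch reports a complexity of 3, well inside the green zone.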
Depth of Decision Nesting
Deeply nested code is more complex even if the cyclomatic complexity is the same since there are dependencies between decisions.
Counting starts at one at the function's opening curly brace. Add one each time a decision block opens and subtract one each time it closes.
- Green zone: up to decision nesting of 4.
- Yellow zone: 5 or 6
- Red zone: 7 or greater
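To make the difference between nesting depth and cyclomatic complexity concrete, here is a small sketch with two hypothetical functions (the order fields and prices are invented for illustration). Both contain the same three decisions, so their cyclomatic complexity is the same, but the first nests them while the second uses guard clauses.

```python
def shipping_cost_nested(order):
    # Three decisions, each inside the previous one.
    # By the counting rule above, the innermost block is at depth 4.
    if order["items"]:
        if order["address"]:
            if order["total"] < 50:
                return 5
            return 0
        raise ValueError("missing address")
    raise ValueError("empty order")


def shipping_cost_flat(order):
    # The same three decisions as guard clauses.
    # No decision sits inside another, so the depth never exceeds 2.
    if not order["items"]:
        raise ValueError("empty order")
    if not order["address"]:
        raise ValueError("missing address")
    if order["total"] < 50:
        return 5
    return 0


small_order = {"items": ["book"], "address": "1 Main St", "total": 20}
print(shipping_cost_nested(small_order), shipping_cost_flat(small_order))
```

In the flat version each decision can be read on its own, whereas in the nested version the innermost return only makes sense once the two enclosing conditions are kept in mind.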
Number of Parameters
From Building Maintainable Software: Ten Guidelines for Future-Proof Code: control interface complexity by limiting the number of parameters to 4.
- Green zone: up to 4
- Yellow zone: 5 or 6
- Red zone: 7 or more
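One common way to get out of the red zone is to group related parameters into small value objects. The sketch below is only an illustration; the Rectangle and Style classes and both functions are hypothetical names, not from the course.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rectangle:
    x: float
    y: float
    width: float
    height: float


@dataclass(frozen=True)
class Style:
    color: str = "black"
    line_width: float = 1.0
    filled: bool = False


# Before: seven parameters (red zone).
def describe_rect_long(x, y, width, height, color, line_width, filled):
    fill = "filled" if filled else "outlined"
    return f"{fill} {color} {width}x{height} at ({x}, {y}), line {line_width}"


# After: two parameters (green zone), with related values grouped.
def describe_rect(rect: Rectangle, style: Style) -> str:
    fill = "filled" if style.filled else "outlined"
    return (f"{fill} {style.color} {rect.width}x{rect.height} "
            f"at ({rect.x}, {rect.y}), line {style.line_width}")


print(describe_rect(Rectangle(0, 0, 2, 3), Style(color="red")))
```

The grouping also gives the caller names for the concepts involved, which a long positional argument list does not.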
Fan Out
Number of unique calls to other functions (including recursion).
The quantity (fan-out)^2 / 10 has the best correlation to defect density.
- Green zone: up to 6
- Yellow zone: 7 to 9
- Red zone: >= 10
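A minimal sketch of measuring fan-out, again using Python's standard ast module. It counts the distinct names that are called in a piece of source code; treating built-ins such as sum and len as calls to other functions is an assumption of this example, and the sample function is invented.

```python
import ast


def fan_out(source: str) -> int:
    """Count the distinct function or method names called in `source`."""
    called = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):         # plain call: f(x)
                called.add(func.id)
            elif isinstance(func, ast.Attribute):  # method call: obj.f(x)
                called.add(func.attr)
    return len(called)


REPORT_SOURCE = '''
def report(rows):
    cleaned = clean(rows)
    total = sum(cleaned)
    log(total)
    log(len(cleaned))
    return format_total(total)
'''

n = fan_out(REPORT_SOURCE)
print(n, n ** 2 / 10)
```

The sample calls clean, sum, log (twice, counted once), len, and format_total, so its fan-out is 5 and (fan-out)^2 / 10 is 2.5.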
Local vs. Global Complexity (the interplay of complexity metrics)
Fewer functions means greater cyclomatic complexity in each function, and the inverse is also true.
- Global complexity metrics are: Number of parameters and fan out
- Local complexity metrics are: cyclomatic complexity and depth of decision nesting
Designs with high cyclomatic complexity and high depth of decision nesting are too locally complex.
Designs with a high degree of fan out and a high number of parameters are too globally complex.
You can trade local complexity for global complexity, and the inverse.
Put limits on global and local complexity.
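The trade-off can be sketched with two versions of the same hypothetical pricing rule (all names and numbers are invented for illustration). The first keeps every decision in one function, which is locally complex with zero fan-out. The second moves each decision into a helper, so the top-level function has no decisions but a fan-out of 4.

```python
# Locally complex: four decisions in one function, no calls to helpers.
def price_inline(quantity, is_member, coupon):
    price = quantity * 10.0
    if quantity >= 10:
        price = price * 0.9
    if is_member:
        price = price * 0.95
    if coupon == "SAVE5":
        price = price - 5
    if price < 0:
        price = 0.0
    return price


# Each helper holds exactly one of the decisions above.
def bulk_discount(price, quantity):
    return price * 0.9 if quantity >= 10 else price


def member_discount(price, is_member):
    return price * 0.95 if is_member else price


def apply_coupon(price, coupon):
    return price - 5 if coupon == "SAVE5" else price


def clamp_to_zero(price):
    return max(price, 0.0)


# Globally complex: no decisions here, but a fan-out of 4.
def price_split(quantity, is_member, coupon):
    price = quantity * 10.0
    price = bulk_discount(price, quantity)
    price = member_discount(price, is_member)
    price = apply_coupon(price, coupon)
    return clamp_to_zero(price)


print(price_inline(12, True, "SAVE5"), price_split(12, True, "SAVE5"))
```

Neither version is automatically better. The point of putting limits on both kinds of complexity is to stop either one from growing unchecked while the other is being reduced.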
Fan-in: there is no correlation between fan-in and defect density. It is a measure of reuse rather than complexity.
Fundamental Design Principles
Syntax vs. Semantics
- Syntax: about structure
  - e.g. measuring design complexity using the above metrics
  - can be automated
  - the compiler is a ruthless master of syntax
- Semantics: about meaning
  - what the code means
  - can't be automated today
  - code can be syntactically correct but semantically meaningless
Bug is a horrible word. Use defects.
Defects are semantic inconsistencies: a difference between the semantics the code has versus the semantics the code should have.
Because syntax checking can be automated, semantics is where our attention matters most.
Principle: Use Abstraction
The principle of ignoring those aspects of a subject that are not relevant to the current purpose in order to concentrate solely on those that are.
—Oxford 97
Abstraction is permission to ignore. Abstraction is THE number one complexity management tool.
From 1952:
When a program has been made from a set of subroutines, the breakdown of the code is more complete than it otherwise would be. This allows the coder to concentrate on one section of a program at a time without the overall detailed program continually intruding.
—Wheeler 52
Abstraction at the cost of performance has always been a point of discussion. Better hardware performance has nullified the performance arguments.
In most interviews, the interviewers spend most of their time asking questions about a specific language. Steve Tockey said that you shouldn't spend time on that but on asking questions about abstraction, such as "what is different between these X items?" The point is to see how many different ways the candidate can abstract.
Those who are good at abstracting write the smallest, cleanest code that covers the problem space. They're the best coders. The people that are bad at abstracting write bad code.
A programming language is a learnable skill. Why waste time asking about it? Abstraction is more difficult to learn and therefore usually needs to be baked in.
Principle: Encapsulate Design Decisions
Abstraction and encapsulation are related concepts.
- Abstraction is permission to ignore
- Encapsulation hides the details from you altogether
Large programs that use encapsulation effectively are easier to modify—by a factor of 4—than programs that don't. Encapsulation benefits are both technical and financial for an organization.
It has indisputably proven its worth especially in fast-changing environments. It's important to multiple paradigms including structured and OO.
Required for Encapsulation: Design by Contract
Functions need to be understood, and specified, in terms of a contract.
- What it requires beforehand
- What it guarantees afterward
- Put contracts into the code
- Semantic mismatch: the calling programmer has to match the semantics of the called code. A mismatch occurs when the semantics implemented in the code are not the semantics the calling programmer assumed.
Elements of a contract
Without design by contract, there can be no encapsulation. Violations of encapsulation commonly break code.
Requires
Constraints:
- on input parameters
- on state
Guarantees
- constraints on output parameters
- constraints on state
- errors/exceptions
- performance limits
Do not include elements already enforced by the language. A contract covers what is over and above what the compiler enforces.
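A minimal sketch of a contract written into the code, assuming a hypothetical BoundedStack class. The requires and guarantees are stated in docstrings and checked with plain assert statements. This is one simple way to do it, not the course's prescribed mechanism, and Python skips assert statements when run with the -O flag.

```python
class BoundedStack:
    """A stack holding at most `capacity` items (hypothetical example)."""

    def __init__(self, capacity: int):
        # Requires: capacity is positive (a constraint on an input parameter).
        assert capacity > 0, "requires: capacity > 0"
        self._capacity = capacity
        self._items = []

    def size(self) -> int:
        return len(self._items)

    def push(self, item) -> None:
        """Requires: the stack is not full (a constraint on state).
        Guarantees: size grows by one and `item` is on top."""
        assert self.size() < self._capacity, "requires: stack not full"
        old_size = self.size()
        self._items.append(item)
        assert self.size() == old_size + 1 and self._items[-1] is item

    def pop(self):
        """Requires: the stack is not empty (a constraint on state).
        Guarantees: size shrinks by one and the former top is returned."""
        assert self.size() > 0, "requires: stack not empty"
        old_size = self.size()
        item = self._items.pop()
        assert self.size() == old_size - 1
        return item


stack = BoundedStack(capacity=2)
stack.push("a")
stack.push("b")
print(stack.pop(), stack.size())
```

A caller needs only the docstrings to use the class correctly; nothing about the underlying list leaks into the contract.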
Activity: Design by contract
All semantic decisions need to be communicated by the programmer who implements the stack to the programmers who are going to use it. Communicate through the contract.
Program to and not through an interface. All you should need is the contract. Some contractual changes are safe while others need to be negotiated.
Many defects are essentially contract violations or differing semantic interpretations of an API. With very little extra work (<3% overhead), whole classes of defects are avoided by managing code to the contract.
Manage the semantics at the interface.
Principles: Maximize Cohesion, Minimize Coupling
The overall process of software development is decomposition followed by recomposition. Not all decompositions are equal.
Cohesion and coupling help you assess different decompositions and choose from among them.
Cohesion and coupling apply at the method level—methods should be highly cohesive/loosely coupled. The principles also apply to classes, packages, and systems.
Cohesion and coupling are pervasive properties that should apply at all levels of the decomposition.
Cohesion and coupling also apply within a method: add blank lines to visually separate groups of code that are cohesive wholes.
Cohesion
- Cohesion: indivisibility of a given part. To what extent do pieces of the whole solve a complete problem that's part of the whole?
Count the number of cohesion violations.
An example: a function bool isLanguageInstalled(int languageCode, bool deleteIfNotTrue) that both checks whether the language is installed and, if the second parameter is false and the language is found, deletes it (the delete operation returns false). This is a semantic violation: the name promises a query, but the function also performs a deletion.
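One way to repair that example is to split it into a query and a command, each solving one complete problem. The sketch below is a hypothetical reworking; the module-level set and the language codes are stand-ins for whatever the real system would use.

```python
# Stand-in for the real installation registry (hypothetical codes).
installed_languages = {1033, 1036}


def is_language_installed(language_code: int) -> bool:
    """Query only: report whether the language is installed."""
    return language_code in installed_languages


def uninstall_language(language_code: int) -> None:
    """Command only: remove the language if it is present."""
    installed_languages.discard(language_code)


if is_language_installed(1033):
    uninstall_language(1033)
print(is_language_installed(1033), is_language_installed(1036))
```

Each function now does what its name says and nothing else, and the boolean flag parameter disappears.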
Coupling
Do not connect things that have no business being connected.
- What is the decomposition represented by the software?
- Is it highly cohesive, or can the cohesion be improved?
- Is it loosely coupled, or can the coupling be improved?
Law of Demeter: Principle of Least Knowledge
Method M of object O may only invoke methods of:
- O itself
- O's direct component objects
- M's parameters
- Objects created or instantiated within M
Avoid invoking methods of objects returned by other methods: a.b().c() breaks the law, while a.b() does not.
DON'T: aCustomer.ordersList(1).orderLines(1).isMedium().sellingPrice()
- The caller shouldn't need to know about order lines or media in order to get the selling price. This is too tightly coupled.
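A sketch of how the chained call could be replaced by delegation. The class and method names are hypothetical, loosely modelled on the DON'T example: each object only talks to its own direct components, and the caller makes a single call on the customer.

```python
class OrderLine:
    def __init__(self, selling_price: float):
        self._selling_price = selling_price

    def selling_price(self) -> float:
        return self._selling_price


class Order:
    def __init__(self, lines):
        self._lines = list(lines)

    def line_price(self, line_index: int) -> float:
        # Order talks only to its own direct components (its lines).
        return self._lines[line_index].selling_price()


class Customer:
    def __init__(self, orders):
        self._orders = list(orders)

    def selling_price(self, order_index: int, line_index: int) -> float:
        # Customer talks only to its own direct components (its orders).
        return self._orders[order_index].line_price(line_index)


customer = Customer([Order([OrderLine(9.99), OrderLine(4.50)])])
# One call on the customer; no chain through Order and OrderLine.
print(customer.selling_price(0, 1))
```

If the way orders store their lines changes later, only Order needs to change; callers of Customer are unaffected.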
Loosely coupled
If things do need to be connected, make the connection as loose as possible.
Principle: Design to Invariants
What are the requirements in common between all possible instances of a family of objects?
Base class contains the common functionality.
In what ways are the possible instances different?
Requirements that will remain relevant for the whole of the product's life cycle are invariants; make those part of the product platform.
Principle: Design for Change
Design so that the things that change (the variants) are isolated from all of the standard stuff (the invariants).
Base class contains the common functionality.
Use architectural layering to hide communication protocol variability (REST, SOA, etc.). Design patterns hide some important variation behind an abstract interface: adapter, bridge, strategy, factory method, abstract factory, template method, iterator, decorator, proxy, etc.
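As one illustration of hiding variation behind an abstract interface, here is a minimal strategy-pattern sketch. The shipping domain and all names are invented for the example. The invariant part (a total is the subtotal plus a shipping charge) lives in order_total, and the part expected to change sits behind ShippingPolicy.

```python
from abc import ABC, abstractmethod


class ShippingPolicy(ABC):
    """Abstract interface hiding how the shipping charge is computed."""

    @abstractmethod
    def charge(self, subtotal: float) -> float:
        ...


class FlatRate(ShippingPolicy):
    def charge(self, subtotal: float) -> float:
        return 5.0


class FreeOverFifty(ShippingPolicy):
    def charge(self, subtotal: float) -> float:
        return 0.0 if subtotal >= 50 else 5.0


def order_total(subtotal: float, policy: ShippingPolicy) -> float:
    # Invariant: this function does not change when a new policy is added.
    return subtotal + policy.charge(subtotal)


print(order_total(20.0, FlatRate()), order_total(60.0, FreeOverFifty()))
```

Adding a new policy means adding a new subclass; order_total and its callers stay as they are.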
If the requirement will change multiple times during the product's life cycle then you should design for change.
- How much does it cost now to build the simplest possible thing that could work?
- How much will it cost later to make the product do more than the simplest possible thing?
- How much does it cost now to build something that's expandable in the future?
Strategy: Delay the Binding of Values to Implementation
Attempt to delay the bindings. TurboTax example.
- Named constants instead of magic numbers
- Configuration/preferences/registry files
- Dependency injection
- Inversion of control
- Data-driven design
- Self-configuration
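A small sketch combining three of the techniques in the list: a named constant, a configuration source, and dependency injection. The tax-rate scenario, the JSON key, and the values are all made up for illustration; this is not the TurboTax example from the course.

```python
import json

# Named constant instead of a magic number (the value is made up).
DEFAULT_TAX_RATE = 0.07


def load_tax_rate(config_text: str) -> float:
    """Read the rate from configuration, falling back to the named default."""
    config = json.loads(config_text)
    return config.get("tax_rate", DEFAULT_TAX_RATE)


class Invoice:
    def __init__(self, tax_rate: float):
        # Dependency injection: the rate is supplied by the caller,
        # not hard-coded inside the class.
        self._tax_rate = tax_rate

    def total(self, subtotal: float) -> float:
        return round(subtotal * (1 + self._tax_rate), 2)


# The value is bound when the configuration is read, not when the code is written.
rate = load_tax_rate('{"tax_rate": 0.08}')
print(Invoice(rate).total(100.0))
```

Changing the rate is now a configuration change rather than a code change, and tests can inject any rate they like.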
Strategies
Commonalities
Basic set of job control states and events shouldn't change.
Variabilities
- Add a self-test on startup
- Priority queueing
- Cancel jobs while they're still in the queue
- Enable an idle state
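One way to keep the commonalities stable while leaving room for the variabilities is a data-driven transition table. The sketch below is an assumption about how that might look; the state and event names are invented, since the notes do not list them. Adding "cancel jobs while they're still in the queue" becomes a new table entry rather than a change to the lookup logic.

```python
# Commonality: the basic job states and events (names are hypothetical).
TRANSITIONS = {
    ("queued", "start"): "running",
    ("running", "finish"): "done",
    ("running", "fail"): "failed",
}


def next_state(state: str, event: str, table=TRANSITIONS) -> str:
    """Look up the next state; reject events the table does not allow."""
    try:
        return table[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state!r}") from None


# Variability: cancelling a queued job is added as data,
# without touching next_state.
WITH_CANCEL = {**TRANSITIONS, ("queued", "cancel"): "cancelled"}

print(next_state("queued", "start"))
print(next_state("queued", "cancel", WITH_CANCEL))
```

The other variabilities, such as an idle state, could be added the same way, as extra entries in the table.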
Principle: Avoid Premature Optimization
More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason, including blind stupidity.
-- W.A. Wulf
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
--Donald Knuth
Jackson's Rules of Optimization:
- Rule 1: Don't do it
- Rule 2 (for experts only): Don't do it yet; that is, not until you have a perfectly clear and unoptimized solution.
--M.A. Jackson