
An Analogy for Types January 14, 2017

Posted by PythonGuy in Uncategorized.

Sometimes the best way to teach a principle is to share an analogy.

Let’s come up with an analogy of types. Let’s say your program is a set of instructions you give to Amelia Bedelia.

Now, suppose Amelia Bedelia needed you to spell out the type of everything you refer to. “Please, Amelia Bedelia, bake a cake” becomes “bake (a function which takes a recipe for something bakeable) a cake (an instance of the cake class, which is a bakeable item)”.

Versus, “Bake a cake”.

See the point?

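Now, if you told Amelia Bedelia “bake a shoe” in a dynamic, strongly typed system, she would look for the recipe for baking a shoe and, not finding one, would say, “I can’t do that. I can’t find the recipe for a shoe.” In a static system, she would refuse as soon as you gave the instruction, because a shoe is not a bakeable item.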

Either way, the end result is the same. The question is when does Amelia Bedelia tell you she can’t do it: right after you tell her, or when she realizes she can’t do it.
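
Here is a minimal sketch of the dynamic, strongly typed Amelia Bedelia in Python. The names `Cake`, `Shoe`, `RECIPES` and `bake` are made up for this illustration; the point is only that she complains at the moment she goes looking for the recipe, not when she is handed the instruction.

```python
# All names here (Cake, Shoe, RECIPES, bake) are hypothetical illustrations.
class Cake:
    pass


class Shoe:
    pass


# Amelia's recipe book: maps a type to the instructions for baking it.
RECIPES = {Cake: "mix, pour, bake at 350F for 30 minutes"}


def bake(item):
    """Look up the recipe for whatever was handed over; complain only if there is none."""
    recipe = RECIPES.get(type(item))
    if recipe is None:
        raise TypeError(f"I can't find the recipe for a {type(item).__name__}.")
    return f"Baking a {type(item).__name__}: {recipe}"


print(bake(Cake()))

try:
    bake(Shoe())
except TypeError as err:
    # The refusal arrives at run time, when the lookup fails.
    print(err)
```

Notice that calling `bake` needed no declarations at all: "bake a cake" is just `bake(Cake())`.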

But remember how hard it was to tell the statically, explicitly typed Amelia Bedelia to bake a cake?

End.


The Code Development Lifecycle January 13, 2017

Posted by PythonGuy in Uncategorized.
  1. Clearly identify the problem.
  2. Document the problem.
  3. Identify multiple solutions to the problem.
  4. Document the solutions.
  5. Choose the best solution.
  6. Document the reasons why you think the solution you chose is best.
  7. Write unit tests to demonstrate and reproduce the problem.
  8. Document the unit tests.
  9. Write integration tests to demonstrate and reproduce the problem.
  10. Document the integration tests.
  11. Correct the code to make the unit tests pass.
  12. Document the code.
  13. Code review for style and consistency.
  14. Deploy to integration system.
  15. Document the deployment.
  16. Run the code in the larger system against integration tests.
  17. When all tests pass, deploy to production.
  18. Document the deployment.

Notes:

  • Identifying the problem requires the art of science. Considering your observations, propose a theory. Try to disprove that theory with tests. The theory that survives all critical tests may be correct, but please don’t limit your imagination. As you gain more experience as a developer, you’re going to see more kinds of problems so you don’t have to be so imaginative.
  • Document everything. Why? Because it helps you move on with your life and it helps the poor schmuck who has to keep up with you or follow you.
  • Identify multiple solutions. If you only have one solution in mind, you have a bad imagination.
  • Choose the best solution. What is “best”? That depends on you and your team values. You should have a discussion with your team on what is truly important.
  • Unit tests test only one function, and not even the code that the function calls. Mock aggressively. (This is where dynamic, strong type systems shine best.)
  • Integration tests test that two systems interface properly. When you have two systems that interface properly, you have a new system that includes them both that needs to be tested with other systems. Integration tests take a long time to run and are usually quite complicated.
  • When you write a test, make sure it fails. If it doesn’t fail, it is a bad test.
  • Write only as much code as it takes to make the tests pass, and no more. If you want to add more, you need to go back to step 1.
  • Code review will never catch bugs. Don’t try to catch bugs in code review. Instead, check that the developer has been keeping up best practices and ensure that this is code you want to maintain in the long run.
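
To make the note about unit tests and mocking concrete, here is a small sketch using the standard library's `unittest.mock`. The function `total_cost` and the `price_of` method on the service are hypothetical names invented for this example. Because Python is dynamic, the mock can stand in for the real service without sharing a base class or interface with it.

```python
from unittest import mock


def total_cost(items, price_service):
    """The one function under test: sums the prices the service reports."""
    return sum(price_service.price_of(item) for item in items)


def test_total_cost_sums_prices():
    # A Mock replaces the real (hypothetical) price service, so only
    # total_cost itself is exercised, not the code it calls.
    service = mock.Mock()
    service.price_of.side_effect = [2, 3]

    assert total_cost(["tea", "milk"], service) == 5

    # The collaborator was asked once per item, in order.
    assert service.price_of.call_args_list == [mock.call("tea"), mock.call("milk")]


test_total_cost_sums_prices()
```

To check that the test can fail, temporarily break `total_cost` (for instance, return 0) and confirm the assertion trips before restoring it.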

 

Static Typing January 13, 2017

Posted by PythonGuy in Uncategorized.

One of the hottest debates in programming, even today, is typing. By that, of course, I mean variable types, not the sort of typing you do on the keyboard, although the editor wars are still raging. (ViM is the best by the way.)

I’d like to try and approach this discussion with some logic. Before I engage the logic muscles, though, let me announce that I have written thousands and thousands and maybe millions of lines of code. I don’t know. I have my keyboard I’ve been using for the past four years and the letter “s” is completely gone and “a” and a few others are on their way out. I am paid to write code, I am paid to make other people’s code work, and I am paid to tell other engineers how to write their code. I’m a senior engineer.

We had, about six months ago, a very lively debate about typing. And we decided to go with Python. I think that was the right decision. I have an opinion based on lots of experience, and I think I am right, even without any logic.

But let’s set that aside.

Let’s do this logically. Here is a list of logical statements.

  1. A “variable” is any named entity in your program that stores a “value”. A value can be anything: an integer, a float, a string, a complex data structure like an array or list, or even a function, class, module, or stack trace.
  2. The “type” of a value tells the programmer and program alike how the value behaves. Certain behavior varies based on the type. For instance, you don’t add integers the same way as you add floats.
  3. “Strong typing” means you can tell what type a value is with no other information than the value itself. Python is an example of a “strongly typed” language: in CPython every value is stored in memory as a PyObject, and the language can tell you the type of any value.
  4. “Weak typing” means you cannot tell what type a value is without additional information. C and C++ are good examples of this, as the same bytes in memory could be an int or a float or anything else. Without type declarations in the language itself, it would be impossible to keep things straight.
  5. In many languages, variables hold information on the type of the value they store. However, this is not true for all languages. For instance, in Python, a variable is simply a name bound to a value; module-level names, for example, are stored in a dict.
  6. “Static typing” means that you cannot change the type of the value in a variable. Some languages allow you to assign a derived class of the type of the variable, others are more strict.
  7. “Dynamic typing” means any variable can hold any value. Python is fully dynamic, but many languages are partially dynamic as they cannot store all values in a variable or some variables do contain type information.
  8. “Explicit typing” means the programmer must tell the computer what type each variable is.
  9. “Implicit typing” means the programmer does not tell the computer what type each variable is. The computer can infer the types by simple analysis.
  10. “Type system” is the way types are treated by a particular language, and includes the language used to describe its types.
  11. “Simple” and “complex” refer to the number of components and the number of sub-components in those components. For example, the function “foo(bar, baz)” is more complex than “foo(bar)” because it takes 2 parameters. (Parameters are sub-components of a function.)
  12. “Correct code” means that the code accomplishes the purpose it was intended to accomplish. “Incorrect code” means it is not correct. Note that simply because a program compiles does not mean it is correct. There must be some human element to judge the correctness of the program.
  13. Simple is better than complex, but the code must be correct for it to matter at all.
  14. Explicit is better than implicit since it helps people unfamiliar with a system understand how it works.
  15. Explicit typing requires a type system that is explicit. That is, the programmer must spell out what the types are using the language the type system uses.
  16. Implicit typing merely hides the type system from the programmer. However, there is still a type system underneath that the programmer needs to be aware of when he violates the constraints of the system.
  17. Dynamically typed languages typically have a simpler type system than static type systems. All variables can be any type of value so there are no constraints like there are in static systems.
  18. Statically typed languages must have a more complicated type system. This is because it imposes at least one constraint: Variables cannot hold a different type of value.
  19. Weakly typed languages require static typing. This is because it is impossible to manage the values without knowing what types they are, and the values themselves do not contain that information.
  20. Strongly typed languages do not require static typing. This is because the values already carry their own type information; a static declaration would be additional information that only needs to be consistent with what the values report.
  21. Complex systems are more difficult to understand and manipulate than simple systems.
  22. There is a class of error called “type mismatch errors”. They are introduced when the programmer creates incorrect code that improperly handles the values in question.
  23. Many static type systems eliminate or at least reduce type mismatch errors at compile time.
  24. Strong, dynamic type systems do not eliminate or reduce type mismatch errors at compile time.
  25. Whether or not errors are caught at compile time or run time doesn’t matter as long as the errors are caught.
  26. In order to prove your software correct, you must demonstrate that it behaves as expected. Only the simplest of programs can be analyzed by reading the code.
  27. Statement 26 requires writing what we call “tests”. The code is run against the test, and if it passes, then the code is assumed to be correct. If the test is not able to detect errors in the code, then it is not a sufficient test.
  28. Statements 23 and 24 imply that the compiler is doing some of the tests that would otherwise be handled in the testing phase.
  29. Since explicit is better than implicit, implicitly testing the code with the compiler is worse than explicitly testing the code with tests.
  30. Therefore, dynamic, strong typing is best.
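
A few of the definitions above can be seen directly in Python. This is only a sketch: the names `describe`, `thing` and `namespace` are invented for illustration, and the dict shown is an explicit namespace passed to `exec`, which is one way to watch a name-value pair land in a dict.

```python
def describe(value):
    """Strong typing (statement 3): ask the value itself what it is."""
    return type(value).__name__


# Dynamic typing (statement 7): the same name can hold any value.
thing = 42
first = describe(thing)       # "int"
thing = "forty-two"
second = describe(thing)      # "str"

# Statement 5: a variable is a name-value pair stored in a dict.
namespace = {}
exec("answer = 42", namespace)
stored = namespace["answer"]  # 42

# Statement 24: the type mismatch is caught, but only when the line runs.
try:
    thing + 1
    mismatch = None
except TypeError as err:
    mismatch = type(err).__name__

print(first, second, stored, mismatch)
```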

Addendum

This is really the argument: “Strong, dynamic type systems are much simpler than any other type system. The benefit of a more complicated type system is that only one kind of error is detected during the compilation phase rather than the test phase, but this is a very small benefit compared to the cost of the complexity of having a type system at all. Therefore, in all cases, strong, dynamic type systems are best.” I’ve just spelled out the assumptions and the logic behind it all.

I should mention that people who write code but do not write sufficient tests are cheating. Until you write tests, you cannot understand whether your code is right or wrong. The compiler can’t tell you anything. It exists to convert your code into machine instructions, that’s it. You still have to test those machine instructions for correctness.
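
As a small illustration of that point, here is a function whose type annotations are perfectly consistent but whose answer is wrong. The function `mean` and its deliberate off-by-one bug are invented for this sketch; nothing about the types reveals the mistake, while the one-line test does.

```python
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    # Deliberate bug for illustration: divides by one too many.
    # The annotations are still consistent: floats in, float out.
    return sum(values) / (len(values) + 1)


def check_mean() -> bool:
    """A tiny test: the mean of 2.0 and 4.0 should be 3.0."""
    return mean([2.0, 4.0]) == 3.0


print(check_mean())  # False: the test catches what the types cannot
```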

With strong, dynamic systems you do not write tests that have already been written. For instance, I don’t need to write a test for what happens when you add a string and an integer in Python. Those tests already exist, and cover every possible combination of types. When a new type is introduced, it should include tests that would plug it into the ecosystem. For example, if you want it to support addition, then you need to write the tests that show it adds in some cases but not others.
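
Here is a sketch of what that looks like for a new type. The `Money` class and the `raises_type_error` helper are hypothetical; the only real machinery is Python's `__add__` protocol, where returning `NotImplemented` tells the interpreter to raise `TypeError` for unsupported operands.

```python
class Money:
    """Hypothetical new type that supports addition with other Money only."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        if not isinstance(other, Money):
            # Signal "unsupported"; Python then raises TypeError for us.
            return NotImplemented
        return Money(self.cents + other.cents)

    def __eq__(self, other):
        return isinstance(other, Money) and self.cents == other.cents


def raises_type_error(operation):
    """Return True if calling operation() raises TypeError."""
    try:
        operation()
    except TypeError:
        return True
    return False


# Built-in behaviour: already covered by Python itself, shown for comparison.
assert raises_type_error(lambda: "1" + 1)

# The new type's own tests: it adds in some cases but not others.
assert Money(150) + Money(50) == Money(200)
assert raises_type_error(lambda: Money(150) + 50)
```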

Finally, I want to mention what I think is the most obvious evidence against typing systems. You know how we used to have a big zoo of fundamental particles until physicists were able to figure out quarks? See, if a system is composed of smaller systems, learn those smaller systems and ignore the bigger system, and you’ll understand the bigger system. Every sufficiently complicated type system has, inside of it, a dynamic, strong type system. The dynamic, strong type system is like quarks, and the more complicated type system is like that zoo of fundamental particles. If you really want to understand particle physics, study quarks, not protons and neutrons and all the other composite particles. If you want to do particle physics, you need to do quarks, not protons and neutrons and such. In this way, the strong, dynamic system is the only type system you ever need to learn. Once you’ve tamed that, your job is done.

Which is why I hardly ever see a type error in Python. The one case that seems to arise is when I have more than a few arguments to a function, and I forget the order. The solution is simple: Don’t use ordered arguments!
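
One way to follow that advice in Python 3 is keyword-only parameters: everything after a bare `*` in the signature must be passed by name. The function `resize` and its parameters are made up for this sketch.

```python
def resize(image, *, width, height):
    """Everything after the bare * must be passed by name."""
    return f"{image} -> {width}x{height}"


# Order no longer matters, so it cannot be forgotten.
print(resize("cat.png", height=200, width=300))  # cat.png -> 300x200

try:
    resize("cat.png", 300, 200)
except TypeError as err:
    # Positional use is rejected at call time.
    print("rejected:", err)
```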