of_mdio: merge branch tails in of_phy_register_fixed_link()

Message ID 20170812210321.520045884@cogentembedded.com
State Accepted
Delegated to: David Miller

Commit Message

Sergei Shtylyov Aug. 12, 2017, 9:03 p.m.
Looks like gcc isn't always able to figure out that 3 *if* branches in
of_phy_register_fixed_link() calling fixed_phy_register() at their ends
are similar enough and thus can be merged. The "manual" merge saves 40
bytes of the object code (AArch64 gcc 4.8.5), and still saves 12 bytes
even if gcc was able to merge the branch tails (ARM gcc 4.8.5)...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

---
The patch is against DaveM's 'net-next' repo.

 drivers/of/of_mdio.c |   23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

Comments

David Miller Aug. 14, 2017, 3:09 a.m. | #1
From: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Date: Sun, 13 Aug 2017 00:03:06 +0300

> Looks like gcc isn't always able to figure out that 3 *if* branches in
> of_phy_register_fixed_link() calling fixed_phy_register() at their ends
> are similar enough and thus can be merged. The "manual" merge saves 40
> bytes of the object code (AArch64 gcc 4.8.5), and still saves 12 bytes
> even if gcc was able to merge the branch tails (ARM gcc 4.8.5)...
> 
> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

Applied, but if two instances of the "same" compiler, just with
different targets, change the optimization, it could be because of a
tradeoff which is specific to parameters expressed in that target's
backend.

So in the future we should probably back away from trying to "help"
the compiler in this way.

David Laight Aug. 15, 2017, 11:18 a.m. | #2
From: David Miller
> Sent: 14 August 2017 04:09
> From: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> Date: Sun, 13 Aug 2017 00:03:06 +0300
> 
> > Looks like gcc isn't always able to figure out that 3 *if* branches in
> > of_phy_register_fixed_link() calling fixed_phy_register() at their ends
> > are similar enough and thus can be merged. The "manual" merge saves 40
> > bytes of the object code (AArch64 gcc 4.8.5), and still saves 12 bytes
> > even if gcc was able to merge the branch tails (ARM gcc 4.8.5)...
> >
> > Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> 
> Applied, but if two instances of the "same" compiler, just with
> different targets, change the optimization, it could be because of a
> tradeoff which is specific to parameters expressed in that target's
> backend.
> 
> So in the future we should probably back away from trying to "help"
> the compiler in this way.

Probably a trade-off between code size and execution speed.
I've had 'fun' trying to stop gcc merging tail code paths
in order to avoid the cost of the branch instruction.

	David

Patch

Index: net-next/drivers/of/of_mdio.c
===================================================================
--- net-next.orig/drivers/of/of_mdio.c
+++ net-next/drivers/of/of_mdio.c
@@ -422,16 +422,13 @@  int of_phy_register_fixed_link(struct de
 	struct fixed_phy_status status = {};
 	struct device_node *fixed_link_node;
 	u32 fixed_link_prop[5];
-	struct phy_device *phy;
 	const char *managed;
-	int link_gpio;
+	int link_gpio = -1;
 
-	if (of_property_read_string(np, "managed", &managed) == 0) {
-		if (strcmp(managed, "in-band-status") == 0) {
-			/* status is zeroed, namely its .link member */
-			phy = fixed_phy_register(PHY_POLL, &status, -1, np);
-			return PTR_ERR_OR_ZERO(phy);
-		}
+	if (of_property_read_string(np, "managed", &managed) == 0 &&
+	    strcmp(managed, "in-band-status") == 0) {
+		/* status is zeroed, namely its .link member */
+		goto register_phy;
 	}
 
 	/* New binding */
@@ -454,8 +451,7 @@  int of_phy_register_fixed_link(struct de
 		if (link_gpio == -EPROBE_DEFER)
 			return -EPROBE_DEFER;
 
-		phy = fixed_phy_register(PHY_POLL, &status, link_gpio, np);
-		return PTR_ERR_OR_ZERO(phy);
+		goto register_phy;
 	}
 
 	/* Old binding */
@@ -466,11 +462,14 @@  int of_phy_register_fixed_link(struct de
 		status.speed  = fixed_link_prop[2];
 		status.pause  = fixed_link_prop[3];
 		status.asym_pause = fixed_link_prop[4];
-		phy = fixed_phy_register(PHY_POLL, &status, -1, np);
-		return PTR_ERR_OR_ZERO(phy);
+		goto register_phy;
 	}
 
 	return -ENODEV;
+
+register_phy:
+	return PTR_ERR_OR_ZERO(fixed_phy_register(PHY_POLL, &status, link_gpio,
+						  np));
 }
 EXPORT_SYMBOL(of_phy_register_fixed_link);
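
For readers who want to try the pattern outside the kernel, the shape of the
change can be sketched in plain userspace C. This is only an illustration
under stated assumptions, not the kernel code: struct fake_status,
struct fake_phy, fake_phy_register(), ptr_err_or_zero() and
register_fixed_link() are hypothetical stand-ins for struct fixed_phy_status,
struct phy_device, fixed_phy_register(), PTR_ERR_OR_ZERO() and
of_phy_register_fixed_link(), and the device-tree parsing is reduced to an
enum argument. What it keeps is the structure of the patch: link_gpio
defaults to -1, each branch only fills in what differs, and all three jump
to one shared registration tail.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct fake_status { int link; int speed; };
struct fake_phy { struct fake_status status; int link_gpio; };

/* Simplified userspace version of the error-pointer convention: the top
 * MAX_ERRNO values of the address space encode negative errno codes. */
#define MAX_ERRNO 4095

static int ptr_err_or_zero(const void *p)
{
	if ((uintptr_t)p >= (uintptr_t)-MAX_ERRNO)
		return (int)(intptr_t)p;	/* error pointer: return the errno */
	return 0;				/* valid pointer: success */
}

static struct fake_phy registered;

/* Stub registration: rejects a negative speed, otherwise records the args. */
static struct fake_phy *fake_phy_register(const struct fake_status *st,
					  int link_gpio)
{
	if (st->speed < 0)
		return (struct fake_phy *)(intptr_t)-EINVAL;
	registered.status = *st;
	registered.link_gpio = link_gpio;
	return &registered;
}

enum binding { BINDING_NONE, BINDING_MANAGED, BINDING_NEW, BINDING_OLD };

int register_fixed_link(enum binding b, int speed, int gpio)
{
	struct fake_status status = {0};
	int link_gpio = -1;		/* default shared by two of the branches */

	if (b == BINDING_MANAGED)
		goto register_phy;	/* status stays zeroed */

	if (b == BINDING_NEW) {
		status.link = 1;
		status.speed = speed;
		link_gpio = gpio;	/* only this branch overrides the GPIO */
		goto register_phy;
	}

	if (b == BINDING_OLD) {
		status.link = 1;
		status.speed = speed;
		goto register_phy;
	}

	return -ENODEV;

register_phy:
	/* Single call site instead of one per branch. */
	return ptr_err_or_zero(fake_phy_register(&status, link_gpio));
}
```

Calling register_fixed_link(BINDING_NEW, 1000, 5) goes through the shared tail
and returns 0, while BINDING_NONE falls through to -ENODEV. Whether a given
compiler would have merged the three tails on its own depends on the target
and optimization settings, as discussed in the comments above.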